My question relates to this one.
#Quote Mr SimonW
All autograd does is just to calculate the gradient, it has no notion of batching, and I don’t see how it can have different behavior with different batching mechanism.
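To make the quoted point concrete for myself, here is a minimal sketch in plain Python. It does not use PyTorch: `Value` is a hypothetical toy reverse-mode autodiff class, and `squared_error`, `grad_of_summed_loss`, and the sample data are made up for illustration. It only sees scalar operations, so it has no idea how the samples were grouped. Assuming the loss is a plain sum over samples, the gradient comes out the same whether the samples go through as one batch, one at a time, or in two chunks.

```python
class Value:
    """Toy scalar with reverse-mode autodiff (hypothetical, not PyTorch's API)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __sub__(self, other):
        return Value(self.data - other.data, (self, other), (1.0, -1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the graph topologically, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def squared_error(w, x, y):
    # Loss for one sample of a one-parameter model: (w * x - y) ** 2
    diff = w * Value(x) - Value(y)
    return diff * diff


def grad_of_summed_loss(samples, w0=0.5):
    # Build the summed loss over `samples` and return d(loss)/dw at w = w0.
    w = Value(w0)
    total = Value(0.0)
    for x, y in samples:
        total = total + squared_error(w, x, y)
    total.backward()
    return w.grad


data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0)]  # made-up (x, y) pairs

full_batch = grad_of_summed_loss(data)
per_sample_sum = sum(grad_of_summed_loss([s]) for s in data)
two_chunks = grad_of_summed_loss(data[:2]) + grad_of_summed_loss(data[2:])

print(full_batch, per_sample_sum, two_chunks)
```

Note that this equality relies on the loss being a sum. With a mean-reduced loss, the per-chunk gradients would have to be weighted by chunk size before adding them up.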