Non-scalar backward and a self-implemented mini-batch

My question relates to this one.

Quote from SimonW:

> All autograd does is calculate the gradient; it has no notion of batching, and I don’t see how it can have different behavior with different batching mechanisms.
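To illustrate the point in the quote, here is a minimal sketch (my own example, not from the original post) showing that autograd treats a batch like any other tensor: calling `backward()` on a non-scalar output requires an explicit `gradient` argument, and passing a tensor of ones is equivalent to summing the output first. Either way, the gradient accumulated in `x.grad` is the same regardless of how the "batch" was formed.

```python
import torch

# A non-scalar output: one value per "sample" in a mini-batch of 3.
x = torch.ones(3, requires_grad=True)
y = x * 2  # y has shape (3,), so y.backward() alone would raise an error

# Supplying gradient=ones_like(y) is equivalent to y.sum().backward():
# autograd just computes a vector-Jacobian product, with no notion of batching.
y.backward(gradient=torch.ones_like(y))
print(x.grad)  # tensor([2., 2., 2.])

# Same result via an explicit scalar reduction over the batch.
x2 = torch.ones(3, requires_grad=True)
(x2 * 2).sum().backward()
print(torch.equal(x.grad, x2.grad))  # True
```

The choice of `gradient` vector is up to the caller; ones reproduce the summed loss, while other weightings (e.g. `1/N`) reproduce a mean over the batch.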