Apologies if this has been asked to death, but searching didn’t find anything I could use.
I can pass reduction='none' to a loss function and it will output the individual losses for each sample of my minibatch as a vector. But I’m testing a new algorithm that requires the individual gradients for each of the minibatch terms. Currently, running backward() on the reduced loss only gives me the sum of the gradients over all terms, and I cannot figure out how to get a tensor that holds each sample's gradient separately.
So for example, running for a minibatch of 1000 entries:
    lf = nn.CrossEntropyLoss(reduction='none')
    loss = lf(data, labels)
    print(loss.shape)
prints torch.Size([1000]), as it should. I can then run e.g.
    loss[0].backward()
and, assuming I’ve zeroed the grad beforehand, it will deposit the gradient d[loss_0] / dx into x.grad as expected (although this comes at the same cost as computing the gradient of all 1000 entries).
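To make the naive approach concrete, here is a minimal sketch of the per-sample loop I would like to avoid. It is not my actual code: the sizes, the random `inputs`, and the choice of `x` as a plain weight matrix are stand-ins assumed purely for illustration. It uses torch.autograd.grad instead of backward() so that nothing accumulates in x.grad between iterations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real setup: small sizes, random data,
# and x as a weight matrix mapping features to class scores.
batch_size, n_features, n_classes = 8, 5, 3
inputs = torch.randn(batch_size, n_features)
labels = torch.randint(0, n_classes, (batch_size,))
x = torch.randn(n_features, n_classes, requires_grad=True)

lf = nn.CrossEntropyLoss(reduction='none')
data = inputs @ x            # class scores, shape (batch_size, n_classes)
loss = lf(data, labels)      # per-sample losses, shape (batch_size,)

# One backward pass per sample. retain_graph=True keeps the graph alive
# so it can be reused for the next loss[i].
grads = []
for i in range(batch_size):
    (g,) = torch.autograd.grad(loss[i], x, retain_graph=True)
    grads.append(g)
per_sample_grads = torch.stack(grads)  # (batch_size, n_features, n_classes)

# Sanity check: the per-sample gradients should add up to the gradient
# of the summed loss, which is what a single backward pass produces.
(summed_grad,) = torch.autograd.grad(loss.sum(), x)
print(per_sample_grads.shape)
print(torch.allclose(per_sample_grads.sum(dim=0), summed_grad, atol=1e-6))
```

This gives the tensor I am after, but it needs one backward pass per sample, so for 1000 entries it is roughly 1000 times the work of the summed gradient.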
Is there an efficient way to recover each d[loss_i] / dx for all i in the minibatch, without getting the sum?
Thanks for reading the post - let me know if anything is unclear.