Getting individual backward() info on a loss function with (reduction='none')

Hi all,

Apologies if this has been asked to death, but searching didn’t find anything I could use.

I can use the reduction keyword in a loss function and it will output the individual loss for each sample of my minibatch as a vector. But I’m testing a new algorithm that requires the individual gradient of each of those minibatch terms. Running backward() on the loss only gives me the gradient of the sum over terms, and I cannot figure out how to get a tensor containing all of the per-sample gradients.

So for example, running for a minibatch of 1000 entries:

import torch.nn as nn

lf = nn.CrossEntropyLoss(reduction='none')
loss = lf(data, labels)   # data, labels: my minibatch of 1000 samples
print(loss.shape)

gives

torch.Size([1000])

as it should. I can then run e.g.

loss[0].backward()

and it will deposit the gradient d[loss_0] / dx into the corresponding entries of x.grad as expected (although this comes at the same cost as computing the gradient of the sum over all 1000 entries), assuming I’ve zeroed the grad beforehand.
Is there an efficient way to recover each d[loss_i] / dx for all i in the minibatch, without getting the sum?
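
For concreteness, the brute-force version I have in mind looks roughly like this (the model, data, labels and the choice of x below are just made-up stand-ins for my real setup), with one backward pass per sample:

import torch
import torch.nn as nn

model = nn.Linear(10, 3)                          # stand-in for the real model
x = model.weight                                  # the leaf tensor I want d[loss_i]/dx for
data = torch.randn(1000, 10)
labels = torch.randint(0, 3, (1000,))

lf = nn.CrossEntropyLoss(reduction='none')
loss = lf(model(data), labels)                    # shape: (1000,)

per_sample_grads = []
for i in range(loss.shape[0]):
    # retain_graph=True keeps the graph alive for the next iteration
    g, = torch.autograd.grad(loss[i], (x,), retain_graph=True)
    per_sample_grads.append(g)
per_sample_grads = torch.stack(per_sample_grads)  # shape: (1000, 3, 10)

Using torch.autograd.grad here avoids having to zero .grad between samples, but it is still one backward pass per sample, which is exactly the cost I’d like to avoid.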

Thanks for reading the post - let me know if anything is unclear.

I think you want something like what is discussed in this thread (see also the bug report):

It’s not something that you would change at the “loss end” of the computational graph, but at the parameter end.
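
To give a flavour of what working at the parameter end can look like on recent PyTorch versions (2.x): you can get per-sample parameter gradients in a single vectorized call with torch.func (vmap over grad of a functional call). The model and data names below are made up, and this is only a rough sketch, not necessarily the exact approach from the linked thread:

import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

model = nn.Linear(10, 3)                          # stand-in for the real model
data = torch.randn(1000, 10)
labels = torch.randint(0, 3, (1000,))

params = {k: v.detach() for k, v in model.named_parameters()}
buffers = {k: v.detach() for k, v in model.named_buffers()}

def sample_loss(params, buffers, sample, target):
    # treat a single sample as a batch of size one
    logits = functional_call(model, (params, buffers), (sample.unsqueeze(0),))
    return nn.functional.cross_entropy(logits, target.unsqueeze(0))

# grad differentiates w.r.t. the first argument (the params dict);
# vmap maps that gradient computation over the 1000 samples in one go
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, None, 0, 0))(
    params, buffers, data, labels)

# e.g. per_sample_grads['weight'] has shape (1000, 3, 10): one gradient per sample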

Best regards

Thomas