Getting per-sample gradients instead of summed gradients

I was wondering, is there a way to get a separate gradient for each data point? By default, autograd sums all the gradients into var.grad, but I would like to have the gradient values before they are summed. In the example below, I have two data points, and I need two sets of gradients, one for each row of the batch.

import torch
import torch.nn as nn

class MnistFC(nn.Module):
  def __init__(self):
    super(MnistFC, self).__init__()
    self.fc = nn.Linear(28*28, 10)

  def forward(self, x):
    x = x.view(-1, 28*28)
    x = self.fc(x)
    return x

torch.manual_seed(47)  # seed before creating the net and the input so the run is reproducible
net = MnistFC()
x = torch.randn([2, 784])

out = net(x)
torch.autograd.grad(out, net.parameters(), torch.ones([2, 10]), retain_graph=True)

To clarify, I want net.fc.weight.grad to have shape [2, 10, 784] instead of [10, 784] (the weight of nn.Linear(784, 10) has shape [10, 784]). A list of length two, where each element has shape [10, 784], would also be fine. One way to do this is to loop over every single row of the data, as in the sketch below, but that is not efficient. I am looking for an efficient way to get these gradients with just one autograd.grad call.
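For reference, this is the kind of loop I mean (just a minimal sketch; per_sample_grads is only an illustrative name, not an existing API):

# Inefficient baseline: one autograd.grad call per row of the batch.
per_sample_grads = []
for i in range(x.shape[0]):
    out_i = net(x[i:i+1])                               # forward pass on a single row
    g = torch.autograd.grad(out_i, net.parameters(),
                            torch.ones([1, 10]))        # gradients for this row only
    per_sample_grads.append(g)                          # list of length 2, one tuple per row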

Thanks,
Reza

Hello Reza,

How about changing the reduction argument of the loss function you are using?
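Something along these lines (a minimal sketch, assuming a classification loss such as nn.CrossEntropyLoss; the targets here are made up):

# With reduction='none' the loss keeps one value per sample instead of a single scalar.
criterion = nn.CrossEntropyLoss(reduction='none')
targets = torch.tensor([3, 7])            # hypothetical labels for the two rows
loss = criterion(net(x), targets)         # shape [2], one loss value per data point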

No, that does not solve my issue. The reduction argument only controls whether the per-sample loss values are averaged, summed, or left as they are; it does not change how the gradients are accumulated. I need a separate gradient for every data point.
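To make it concrete, even when the loss is left unreduced, the gradient stored in the parameter is still summed over the batch (a quick check; the labels here are made up):

criterion = nn.CrossEntropyLoss(reduction='none')
loss = criterion(net(x), torch.tensor([3, 7]))   # shape [2], one loss per row
loss.sum().backward()                            # backward still needs a scalar
print(net.fc.weight.grad.shape)                  # torch.Size([10, 784]), not [2, 10, 784]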

Thanks.