I was wondering: is there a way to get a separate gradient for each data point? The default behavior is that autograd sums the per-sample gradients into var.grad, but I would like to have the gradient values before they are summed. In the example below, I have two data points, and I need two sets of gradients, one corresponding to each row of the batch.
```python
import torch
import torch.nn as nn

class MnistFC(nn.Module):
    def __init__(self):
        super(MnistFC, self).__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = self.fc(x)
        return x

torch.manual_seed(47)      # seed before creating the net and the data
net = MnistFC()
x = torch.randn([2, 784])  # a batch of two data points
out = net(x)               # shape [2, 10]
# grad_outputs of ones: autograd sums the per-sample gradients
torch.autograd.grad(out, net.parameters(), torch.ones([2, 10]), retain_graph=True)
```
To clarify further: I want net.fc.weight.grad to have shape [2, 10, 784] instead of [10, 784] (the weight of nn.Linear(784, 10) is stored as [10, 784]). A list of size two, where each element has shape [10, 784], would also be fine. One way to do this is a for loop over every single row of the data, as sketched below, but that is not efficient. I am looking for an efficient way to get these gradients with just one autograd.grad call.
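For reference, here is a minimal sketch of the inefficient per-row loop I mean, assuming the `net` and `x` defined above:

```python
# Loop over the batch one row at a time; this produces per-sample
# gradients but requires a separate forward/backward pass per row.
per_sample_grads = []
for i in range(x.shape[0]):
    out_i = net(x[i:i + 1])        # forward pass on a single row, shape [1, 10]
    grads_i = torch.autograd.grad(
        out_i,
        net.parameters(),
        torch.ones_like(out_i),    # grad_outputs for this single row
    )
    per_sample_grads.append(grads_i)

# per_sample_grads is a list of length 2; per_sample_grads[i][0] is the
# gradient of fc.weight for row i alone, with shape [10, 784].
```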