At the moment, if I want to view the gradient of a specific layer's weights for each individual input in the batch, I loop over the batch and pass a one-hot gradient argument to loss.backward(), like so:
import torch
import torchvision
x = torch.rand((2,3,224,224))
y = torch.ones(2, dtype=torch.long)
m = torchvision.models.resnet18()
criterion_vec = torch.nn.CrossEntropyLoss(reduction='none')
optimizer = torch.optim.SGD(m.parameters(), 0.001)
m.train()
out = m(x)
b = out.shape[0]
grads = []
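for i in range(b):
    # One-hot weights over the batch: only sample i contributes to the backward pass
    idx = torch.zeros(b)
    idx[i] = 1
    loss = criterion_vec(out, y)  # per-sample losses, shape (b,)
    optimizer.zero_grad()
    # idx is already a float tensor, so it can be passed directly as the gradient argument
    loss.backward(idx, retain_graph=True)
    g = m.conv1.weight.grad[0][0][0]
    print(g)
    m.conv1.weight.grad.data.zero_()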
I was wondering if there is an easier way to get the gradients for each input in the batch separately (i.e. per-sample gradients). Or is this reduction over the batch axis forced by cuDNN?
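
For reference, one possible alternative to the loop is sketched below. It assumes PyTorch 2.0 or later, where torch.func provides functional_call, grad and vmap. The small model, sample_loss and per_sample_grads are illustrative names that do not appear in the code above. resnet18 is replaced by a BatchNorm-free stand-in here, since BatchNorm layers in training mode typically need extra handling under vmap.

```python
import torch
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)

# Hypothetical stand-in model (no BatchNorm), used only for illustration
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 4, kernel_size=3),
    torch.nn.ReLU(),
    torch.nn.Flatten(),
    torch.nn.Linear(4 * 6 * 6, 5),
)
x = torch.rand(2, 3, 8, 8)
y = torch.ones(2, dtype=torch.long)

# Detached copies of the parameters, keyed by name
params = {k: v.detach() for k, v in model.named_parameters()}

def sample_loss(params, xi, yi):
    # Add a batch dimension of 1 so the model sees an ordinary batched input
    out = functional_call(model, params, (xi.unsqueeze(0),))
    return torch.nn.functional.cross_entropy(out, yi.unsqueeze(0))

# grad differentiates w.r.t. params (first argument);
# vmap maps over dim 0 of x and y while sharing params across samples
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(params, x, y)

# "0.weight" is the conv layer's weight in the Sequential above
conv_grads = per_sample_grads["0.weight"]
print(conv_grads.shape)  # one gradient per sample along dim 0
```

The result is a dict with one entry per parameter, each carrying an extra leading batch dimension, so conv_grads[i] plays the role of m.conv1.weight.grad in iteration i of the loop.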