I want to get the gradient of the FC layer of a network with respect to each sample, where FC is the last linear (dense) layer.
With a batch size of 1, I can use the following code:
loss_CE = torch.nn.CrossEntropyLoss().cuda()
for i, (x, y) in enumerate(train_loader, 0):
    x = x.cuda()
    inputs = Variable(x, requires_grad=True)
    FV, Logit = model(inputs)
    FV = Variable(FV, requires_grad=True)
    m, y_hat = torch.max(Logit, dim=1)
    loss = loss_CE(Logit, y_hat)
    loss.backward()
    grad = model.fc.weight.grad
For ImageNet and ResNet-50 it produces a tensor of size 1000 x 2048.
It takes a lot of time if I run it on all images with a batch size of 1.
If I increase the batch size, the output of the above code is still 1000 x 2048.
How can I modify it to work on a batch? The output tensor should have size 256 x 1000 x 2048 when the batch size is 256.
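One possible direction, as a minimal sketch rather than a definitive answer: for a linear layer, the gradient of a single sample's loss with respect to the weight is the outer product of that sample's logit gradient and its input features, so the whole batch can be handled with one `einsum`. The helper name `per_sample_fc_weight_grads` and the small layer sizes are hypothetical; it assumes `fc` is an `nn.Linear` like `model.fc`, `features` is the input to that layer (`FV` above), and the loss is cross-entropy summed (not averaged) over the batch so that each row is unscaled.

```python
import torch
import torch.nn as nn

def per_sample_fc_weight_grads(fc, features, targets):
    """Per-sample gradients of cross-entropy w.r.t. fc.weight (hypothetical helper).

    fc:       nn.Linear mapping features -> logits
    features: (B, in_features) input to fc (FV in the question)
    targets:  (B,) class indices (y_hat in the question)
    returns:  (B, out_features, in_features)
    """
    features = features.detach()
    logits = fc(features)
    # Sum instead of mean, so row i of the logit gradient belongs to sample i only
    loss = nn.functional.cross_entropy(logits, targets, reduction="sum")
    (grad_logits,) = torch.autograd.grad(loss, logits)
    # dL_i/dW is the outer product of dL_i/dlogits_i and features_i
    return torch.einsum("bo,bi->boi", grad_logits, features)

# Toy sizes (assumed) standing in for 256 x 1000 x 2048
torch.manual_seed(0)
fc = nn.Linear(8, 5)
fv = torch.randn(4, 8)
y_hat = fc(fv).argmax(dim=1)
grads = per_sample_fc_weight_grads(fc, fv, y_hat)
print(grads.shape)  # torch.Size([4, 5, 8])
```

Note that a 256 x 1000 x 2048 float32 tensor takes about 2 GB, so it may be worth moving each batch's result off the GPU or reducing it before processing the next batch.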