Hi, sorry, newbie here. I’m trying to understand how to do gradient checking in PyTorch, using the MNIST example as a reference (https://github.com/pytorch/examples/blob/master/mnist/main.py). First I wanted to get the analytic gradients with respect to the input, which I think can be done by changing the following lines:
Line 81: data, target = Variable(data), Variable(target) --> data, target = Variable(data, requires_grad=True), Variable(target)
Insert somewhere between lines 81-84: data.register_hook(print)
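For comparison, here is a minimal sketch of reading the input gradient in a recent PyTorch (0.4 or later), where tensors track gradients directly and the `Variable` wrapper is no longer needed. The one-layer `model` and the random `data`/`target` are hypothetical stand-ins, not the network or data loader from the MNIST example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the MNIST network: flatten, then one linear layer.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Fake batch shaped like MNIST input: (batch, channel, height, width).
data = torch.randn(4, 1, 28, 28, requires_grad=True)
target = torch.randint(0, 10, (4,))

# The hook is called during backward() with d(loss)/d(data).
captured = []
data.register_hook(captured.append)

loss = F.nll_loss(F.log_softmax(model(data), dim=1), target)
loss.backward()

# After backward(), the same analytic gradient is also stored in data.grad.
input_grad = data.grad
print(input_grad.shape)
```

Storing the gradient in a list (or reading `data.grad`) instead of printing it makes it easier to compare single entries against a numerical estimate later.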
Then to get the numerical gradients I created a function like so:
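# Relies on torch, F, Variable, model and args as defined in main.py.
def numerical_grad(input_, target, row_idx, col_idx):
    model.eval()  # eval mode (disables dropout); the model stays in eval mode afterwards
    input_shp = input_.size()
    # Perturbation tensor: zero everywhere except one pixel of the first sample.
    E = torch.zeros(input_shp)
    if args.cuda:
        E = E.cuda()
    eps = 0.001
    E[0][0][row_idx][col_idx] = eps
    E = Variable(E)
    # Evaluate the loss at input + E and input - E.
    M1 = input_ + E
    M2 = input_ - E
    out1 = model(M1)
    out2 = model(M2)
    l1 = F.nll_loss(out1, target)
    l2 = F.nll_loss(out2, target)
    # Central difference estimate of d(loss)/d(input[0][0][row_idx][col_idx]).
    grad = (l1 - l2) / (2 * eps)
    return grad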
I assumed this would give me the numerical gradient with respect to the input at the specified row and column index, but it’s way off from the analytic value. Am I doing something wrong? Thanks
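To make the intended check concrete, here is a minimal sketch of a central-difference gradient check written in plain Python rather than PyTorch, so it runs without any dependencies. The tiny linear "model" `W`, the input `x`, and all function names are hypothetical and chosen only for illustration; the loss is the negative log-likelihood of a log-softmax output, as in the MNIST example.

```python
import math

def log_softmax(z):
    # Numerically stable log-softmax over a list of logits.
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]

def logits_of(x, W):
    # Hypothetical linear model: logits = W x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss_fn(x, W, target):
    # Negative log-likelihood of the target class.
    return -log_softmax(logits_of(x, W))[target]

def analytic_grad(x, W, target):
    # dL/dlogits = softmax - onehot(target); dL/dx = W^T (softmax - onehot).
    p = [math.exp(v) for v in log_softmax(logits_of(x, W))]
    d = [p[k] - (1.0 if k == target else 0.0) for k in range(len(p))]
    return [sum(d[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def central_diff(x, W, target, idx, eps=1e-5):
    # Perturb a single input entry by +eps and -eps.
    xp, xm = list(x), list(x)
    xp[idx] += eps
    xm[idx] -= eps
    return (loss_fn(xp, W, target) - loss_fn(xm, W, target)) / (2 * eps)

W = [[0.2, -0.5, 0.1], [1.5, 1.3, -2.1], [0.0, 0.25, 0.3]]
x = [1.0, -2.0, 0.5]
target = 1

ana = analytic_grad(x, W, target)
num = [central_diff(x, W, target, j) for j in range(len(x))]
max_abs_diff = max(abs(a - n) for a, n in zip(ana, num))
print(max_abs_diff)
```

The two gradients agree here because both are taken of the same deterministic function and Python floats are double precision. In a PyTorch check, the comparison is only meaningful if the analytic and numerical gradients come from the model in the same mode (train or eval), and in float32 a step of eps = 0.001 may leave the difference dominated by rounding error.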