.grad is None after 1 round of loss.backward()

I have the following code:

(pimg is PIL.Image)

def fgsm(src_img_file, target, model):

    toTens = transforms.ToTensor()
    PIL_image = pimg.open(src_img_file)
    img = toTens(PIL_image)
    img.unsqueeze_(0)  # torch.nn takes only minibatches, so add fake batch dim
    img.requires_grad_(True)

    criterion = nn.CrossEntropyLoss()
    target_tensor = torch.tensor([target])

    model.eval()
    epsilon = 0.001
    for epochs in range(1, 100):
        pred = model(img)
        print("model pred", pred.max(1)[1])
        loss = criterion(pred, target_tensor)
        loss.backward()
        img = torch.clamp(img + epsilon * torch.sign(img.grad), 0.0, 1.0)  # make sure it is a valid image

after 1 iteration, I get the following error message:
Traceback (most recent call last):
File "run_mnist.py", line 161, in
fgsm(src_img, tgt, model)
File "run_mnist.py", line 120, in fgsm
img = torch.clamp(img + epsilon * torch.sign(img.grad), 0.0, 1.0)  # make sure it is a valid image
TypeError: sign(): argument 'input' (position 1) must be Tensor, not NoneType

Why does this happen? How can I fix it?

img is no longer a leaf variable once you unsqueeze it in place. Gradients are retained only for leaf variables. You can correct the code like so:

image = torch.Tensor(1, 28, 28)
image.requires_grad_(True)          # leaf tensor: its gradient will be retained
image_unsqueezed = image.unsqueeze(0)
pred = model(image_unsqueezed)
target = 3
target_tensor = torch.LongTensor([target])
loss = criterion(pred, target_tensor)
loss.backward()

or using variables explicitly:

image = torch.Tensor(1, 28, 28)
image_var = Variable(image.unsqueeze(0), requires_grad=True)
pred = model(image_var)
target = 3
target_tensor = Variable(torch.LongTensor([target]))
loss = criterion(pred, target_tensor)
loss.backward()
print(image_var.grad)
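
If it helps, you can check which tensors will have their gradients retained with .is_leaf. A minimal sketch of the idea (the sum() here is just a stand-in for the model and criterion, which are assumed to exist elsewhere):

import torch

image = torch.rand(1, 28, 28, requires_grad=True)   # leaf tensor
image_unsqueezed = image.unsqueeze(0)                # produced by an op -> non-leaf

print(image.is_leaf, image_unsqueezed.is_leaf)       # True False

loss = image_unsqueezed.sum()                        # toy stand-in for model + criterion
loss.backward()

print(image.grad.shape)        # torch.Size([1, 28, 28]): gradient retained on the leaf
print(image_unsqueezed.grad)   # None: gradients are not retained for non-leaf tensors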

Thanks for the reply. I tried wrapping the result of unsqueeze in a Variable, but that did not work. I checked the recent doc update, and Variable is now deprecated, so I simply tried using a new tensor with requires_grad set to True. I still get the same error. Here is what I have:

def i_fgsm(src_img_file, target, model):

    toTens = transforms.ToTensor()
    PIL_image = pimg.open(src_img_file)
    _img = toTens(PIL_image)
    _img.unsqueeze_(0)  # torch.nn takes only minibatches, so add fake batch dim
    img = torch.tensor(_img.numpy(), requires_grad=True)
    print(img.shape)

    criterion = nn.CrossEntropyLoss()
    target_tensor = torch.LongTensor([target])

    model.eval()
    epsilon = 0.01

    for epochs in range(1, 100):
        pred = model(img)
        print("model pred", pred.max(1)[1])
        loss = criterion(pred, target_tensor)
        loss.backward()

        img = torch.clamp(img + epsilon * torch.sign(img.grad), 0.0, 1.0)  # make sure it is a valid image

I'm new to the new syntax; I updated the post. See the first block of code above.

I figured it out. The last line has to assign to .data:

img.data = torch.clamp(img + epsilon * torch.sign(img.grad.data), 0.0, 1.0) # make sure it is a valid image

not what I had originally:

img = torch.clamp(img + epsilon * torch.sign(img.grad), 0.0, 1.0) # make sure it is a valid image

I'm not sure what this difference means. It seems that rather than creating a new tensor every loop, I simply change the data on the existing tensor, but I'm not sure. I would appreciate a clear explanation of what .data of a tensor is. Does it simply refer to the underlying values inside the tensor?

.data is the tensor of values underlying a variable. When you apply operations to a variable, you add new nodes to the computational graph that describe its expression. For instance, if you update x by multiplying it by itself, the computational graph becomes x^2 and the gradient is 2x. By assigning to .data you instead update only its value. For instance, if the computational graph is 3x and you set x.data = 4, the gradient is still 3.
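
A minimal sketch of that last point (the numbers are arbitrary; assumes a recent PyTorch where tensors and variables are merged):

import torch

x = torch.tensor(2.0, requires_grad=True)
y = 3 * x                     # computational graph: y = 3x, so dy/dx = 3

x.data = torch.tensor(4.0)    # changes only the stored value, not the graph

y.backward()
print(x.grad)                 # tensor(3.): gradient of 3x, unaffected by the .data change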

You were getting a gradient equal to None because your img is no longer the variable mapped to the original image. For efficiency, gradient values are kept only for leaf variables.
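
To make the difference concrete, here is a minimal sketch of the two update styles (a toy 3-element tensor stands in for the image, and the 0.01 step size is arbitrary):

import torch

x = torch.rand(3, requires_grad=True)     # leaf, like img at the start of the loop
loss = (x * 2).sum()                      # toy stand-in for model + criterion
loss.backward()
print(x.is_leaf, x.grad)                  # True, tensor([2., 2., 2.])

# Rebinding the name: the result of the add/clamp is a new, non-leaf tensor,
# so its .grad stays None and sign(x.grad) would fail on the next iteration.
x_rebound = torch.clamp(x + 0.01 * torch.sign(x.grad), 0.0, 1.0)
print(x_rebound.is_leaf, x_rebound.grad)  # False, None

# Updating .data instead: x remains the same leaf tensor, so .grad keeps
# being populated by later backward() calls.
x.data = torch.clamp(x.data + 0.01 * torch.sign(x.grad.data), 0.0, 1.0)
print(x.is_leaf)                          # True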
