None type for gradient?

Hi! I’m having trouble understanding why the gradient of my “resampled” tensor is None right after I call loss.backward().

Below is the code:

import pdb

import torch
import torch.nn.functional as F


def problem_l2loss(data, target):

    not_done = True
    learning_rate = 0.07
    count = 1

    # Translation column (cd) and scale (scale_a) of the affine transform
    cd = torch.tensor([0, 0]).type(torch.FloatTensor)
    cd = torch.unsqueeze(cd, dim=1).cuda()
    scale_a = torch.unsqueeze(torch.tensor([.5]).type(torch.FloatTensor), dim=1).cuda()
    cd.requires_grad_()
    scale_a.requires_grad_()

    while not_done:

        # Build the 2x2 scaling block of the affine matrix from scale_a
        one_zero = torch.tensor([1, 0]).cuda()
        zero_one = torch.tensor([0, 1]).cuda()
        temp_a = one_zero * scale_a
        temp_b = zero_one * scale_a
        two_by_two = torch.cat((temp_a, temp_b), dim=0)

        M = torch.cat((torch.tensor(two_by_two), cd), dim=1)[None, :, :]
        # The None,:,: is for the batch size
        pdb.set_trace()

        grid = F.affine_grid(M, data.shape)
        grid.requires_grad_()
        resampled = F.grid_sample(data.cuda(), grid, mode='bilinear')[0]

        criterion = torch.nn.MSELoss()

        loss = -1 * criterion(resampled, target.cuda())
        loss.backward()
        # why is the gradient of resampled None here?

        scale_a_grad = scale_a.grad.data
        cd_grad = cd.grad.data

        # Gradient step on the two leaf parameters
        scale_a.data = scale_a.data + learning_rate * scale_a_grad
        cd.data = cd.data + learning_rate * cd_grad

        count = count + 1

        if count == 100:
            print("here")
            not_done = False

    return

Further comments: I made sure to send everything to CUDA before calling requires_grad_() on my variables. I also checked that if I multiply a tensor that has requires_grad = True by one I didn’t explicitly set, the product also has requires_grad = True. And “resampled” has resampled.requires_grad = True, yet resampled.grad is None… scale_a.grad and cd.grad are also None, even though both tensors show requires_grad = True when I inspect them after loss.backward() with a pdb trace… Thanks!
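
Here is a stripped-down version of what I mean (just a toy tensor, leaving out the CUDA and grid_sample parts of my actual code):

import torch

a = torch.tensor([1., 2.])
a.requires_grad_()        # a is a leaf, like scale_a and cd
b = a * 3                 # b is an intermediate result, like resampled
loss = b.sum()
loss.backward()

print(b.requires_grad)    # True
print(b.grad)             # None -- this is the part I don't understand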

By default, only leaf tensors, not intermediate results, get their gradients accumulated in .grad. You can call resampled.retain_grad() before the backward pass to get gradients for intermediate results as well.
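
For example (a small standalone sketch, not your exact setup):

import torch

x = torch.ones(3, requires_grad=True)   # leaf tensor
y = x * 2                               # intermediate result
y.retain_grad()                         # ask autograd to also keep y.grad
loss = y.sum()
loss.backward()

print(x.grad)   # tensor([2., 2., 2.]) -- leaves get .grad by default
print(y.grad)   # tensor([1., 1., 1.]) -- only kept because of retain_grad()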

Best regards

Thomas
