Manual weight updates: why does autograd return None?

I am new to PyTorch, so this may be a simple question; sorry about that.
I have written a function that updates the parameters of a network manually:

    def update_params(self, loss, update_lr):
        # parameter update
        updated_params = OrderedDict()
        for name, param in self.graph_model.gnn.named_parameters():
            if param.requires_grad:
                grad = torch.autograd.grad(loss, param, create_graph=True, allow_unused=True)
                if grad is None:
                    updated_param = param
                else:
                    pdb.set_trace()
                    updated_param = param - update_lr * grad
                updated_params[name] = updated_param
        return updated_params

The loss is as follows:

    loss
    tensor([0.0693], device='cuda:0', grad_fn=<AddBackward0>)

and the first param in the loop is:

    (Pdb) param
    Parameter containing:
    tensor([[-0.2142, -0.1182, -0.2988,  ...,  0.2933, -0.0804, -0.3286],
            [-0.1250,  0.2673,  0.1617,  ...,  0.2363,  0.2026, -0.2973],
            [ 0.0588,  0.2348, -0.2333,  ...,  0.1882,  0.0286, -0.3238],
            ...,
            [-0.1961,  0.1434,  0.0306,  ...,  0.3135,  0.2239, -0.0953],
            [ 0.1190,  0.2062, -0.2643,  ...,  0.3116,  0.1146, -0.1994],
            [ 0.0340, -0.2294,  0.2095,  ..., -0.2376,  0.0456,  0.3151]],
           device='cuda:0', requires_grad=True)

The grad is None for the first param (the first iteration of the loop):

    (Pdb) grad
    (None,)

However, when I check:

    (Pdb) grad is None
    False

it returns False (meaning it goes into the else branch). I am not sure where I am making a mistake.

torch.autograd.grad returns a tuple ((None,) in this case), which is not the same as None!
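
A minimal, self-contained illustration (the tensor names here are made up for the example):

    import torch

    w_used = torch.randn(3, requires_grad=True)
    w_unused = torch.randn(3, requires_grad=True)
    loss = (w_used * 2).sum()          # w_unused never enters the graph

    grad = torch.autograd.grad(loss, w_unused, allow_unused=True)
    print(grad)             # (None,)  -- a one-element tuple
    print(grad is None)     # False    -- the tuple itself is never None
    print(grad[0] is None)  # True     -- index into it before the None check

So in your loop, checking `grad[0] is None` (or unpacking with `grad, = torch.autograd.grad(...)`) gives the comparison you intended.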

@soulitzer Sorry, I should have asked my question more clearly. My first question was: why does autograd.grad return None?

The second question, which you answered, relates to the tuple you mentioned.

So the question is: why does it return None? What are the possible reasons, and how can I resolve this?

Usually autograd would’ve raised an error here, saying that some of the inputs that you are trying to find gradients with respect to aren’t part of the graph that autograd built, but since you explicitly passed allow_unused=True, it would just return None for that input instead.
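
For instance (a small standalone sketch; the exact error wording may differ between PyTorch versions):

    import torch

    x = torch.randn(2, requires_grad=True)
    unused = torch.randn(2, requires_grad=True)
    loss = x.sum()                         # 'unused' is not part of this graph

    try:
        torch.autograd.grad(loss, unused)  # default allow_unused=False
    except RuntimeError as e:
        print(e)  # complains that a differentiated tensor was not used in the graph

    print(torch.autograd.grad(loss, unused, allow_unused=True))  # (None,)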

This can mean several things:

1. You did not use that input in your computation at all.
2. You did use that input in the computation, but the gradient with respect to that input is zero. The issue here is that autograd doesn't actually distinguish between a gradient value of zero and a gradient of None.
3. You are using non-differentiable operations. Certain operations can break the autograd graph, such as constructing a new tensor, or converting to and from NumPy arrays (perhaps to use other scientific libraries).
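
Putting the two points together, one way to resolve it is to index into the tuple that torch.autograd.grad returns and fall back to the unchanged parameter when the gradient is None. A sketch of your function along those lines (assuming the same self.graph_model.gnn and the same torch / collections.OrderedDict imports as in your snippet):

    def update_params(self, loss, update_lr):
        # manual SGD-style update that tolerates parameters unused by this loss
        updated_params = OrderedDict()
        for name, param in self.graph_model.gnn.named_parameters():
            if not param.requires_grad:
                updated_params[name] = param
                continue
            # autograd.grad returns a tuple; take element 0 before the None check
            grad = torch.autograd.grad(loss, param, create_graph=True,
                                       allow_unused=True)[0]
            if grad is None:
                # this parameter did not contribute to the loss; keep it as is
                updated_params[name] = param
            else:
                updated_params[name] = param - update_lr * grad
        return updated_params

Note that calling torch.autograd.grad once per parameter re-traverses the graph on every iteration; passing the full parameter list in a single call and zipping the returned tuple with the names is usually cheaper.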

@soulitzer Thanks for the explanation.