Here is the setup I have some questions about. Suppose we have a model M and we freeze it by setting requires_grad = False on all of its parameters. Let's also instantiate a tensor X = torch.empty(input_shape), set X.requires_grad = True, and create an optimizer optimX = optim.Adam([X]). Suppose we have a function F which takes M(X) and some data X_data, and we run F(M(X), X_data).backward(). This computes the gradients for X. Normally, for a model, gradients are accumulated across backward calls, but in this case X is just a plain tensor. Before the next run I would like to clear the gradients, but since X is just a tensor it has no zero_grad method. Are the gradients for X accumulated in my case, and if so, how do I clear them before the next run?

Never mind, I figured it out. The gradients are accumulated and stored in X.grad. If I want to clear them, I can just set X.grad to None.
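For anyone landing here later, this is a minimal sketch of the setup, not the exact code from my project. The nn.Linear model, the MSE loss standing in for F, and the shapes are all assumptions made up for illustration. It shows that two backward calls without clearing add up in X.grad, and two ways to clear it: setting X.grad to None, or calling zero_grad() on the optimizer that holds X.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Hypothetical stand-ins for M, F, and X_data from the question.
M = nn.Linear(4, 2)
M.requires_grad_(False)            # freeze every parameter of the model
X = torch.randn(3, 4, requires_grad=True)
X_data = torch.randn(3, 2)
F = nn.functional.mse_loss

optimX = optim.Adam([X], lr=1e-2)  # the optimizer expects an iterable of tensors

# Two backward passes without clearing: the second gradient is added to the first.
F(M(X), X_data).backward()
first = X.grad.clone()
F(M(X), X_data).backward()
accumulated = torch.allclose(X.grad, 2 * first)

# Option 1: drop the stored gradient by hand.
X.grad = None

# Option 2: let the optimizer clear the gradients of every tensor it manages.
F(M(X), X_data).backward()
optimX.zero_grad()
# Depending on the PyTorch version, zero_grad() either sets .grad to None
# or fills it with zeros, so accept both outcomes.
cleared = X.grad is None or bool((X.grad == 0).all())

# A typical optimization loop over X with the model kept frozen.
for _ in range(3):
    optimX.zero_grad()
    loss = F(M(X), X_data)
    loss.backward()
    optimX.step()
```

Since X was passed to the optimizer, optimX.zero_grad() at the top of each iteration is probably the more convenient choice inside a loop; setting X.grad = None does the same job when there is no optimizer involved.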
