Manually calculate gradients for model parameters using autograd.grad()

I want to do this

grads = grad(loss, model.parameters())

But I am using nn.Module to define my model. It seems to run the backward() function automatically, and I don't want backward() to calculate any grads; I want to compute the grads myself.

How can I skip the backward() call and prevent any gradient calculation from happening when I call my model?

I was also wondering: is it at all possible to implement a backward-of-backward method in an autograd function?

Hi, a plain nn.Module doesn't call backward() by itself.
You might need to update the library you're using on top of PyTorch so that it stops doing that.
Then you can use autograd.grad() to do what you want.
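To make that concrete, here is a minimal sketch (the tiny model and shapes are illustrative, not the poster's actual network) showing that autograd.grad() computes gradients without any backward() call and without touching the parameters' .grad attributes:

```python
import torch
import torch.nn as nn
from torch.autograd import grad

# Illustrative model, standing in for the poster's actual network.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(3, 4)
target = torch.tensor([0, 1, 1])

loss = nn.functional.cross_entropy(model(x), target)

# grad() returns a tuple with one gradient per parameter and never
# populates the parameters' .grad attributes.
params = list(model.parameters())
grads = grad(loss, params)

assert len(grads) == len(params)
assert all(p.grad is None for p in params)
```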

I am using PyTorch version 1.6, and I am getting this error when calling grad() manually:

“RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.”

As the message says, if you want to backward through the graph multiple times, you have to specify retain_graph=True for all but the last call.
Note that this can also happen if you do some computation before your training loop and re-use part of the graph across iterations: you should recompute the whole graph at every iteration!
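A minimal sketch of the retain_graph behaviour described above (toy tensors, not the poster's model):

```python
import torch
from torch.autograd import grad

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()

# First pass: retain_graph=True keeps the saved buffers alive
# so the graph can be traversed again.
g1 = grad(loss, w, retain_graph=True)

# The last pass may omit retain_graph; without the flag on the
# first call, this second call would raise the "backward through
# the graph a second time" RuntimeError.
g2 = grad(loss, w)

assert torch.equal(g1[0], g2[0])
```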

So here is what I am doing:

  1. Defined the model class (myModel) using nn.Module.
    Now, in training:
  2. Called my model: outputs = myModel(inputs)
  3. Calculated the loss: loss = cross_entropy(outputs, targets)
  4. I am not calling loss.backward(). Instead I use: grads = grad(loss, myModel.parameters())

I am getting the error mentioned above, which suggests that backward() is somehow being called already. I want to prevent that automatic first call to backward().


There is no automatic call to backward. The error can also come from a previous call to autograd.grad().
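For illustration, a previous grad() call through the same graph does produce exactly this error on a second pass (toy example, not the poster's code):

```python
import torch
from torch.autograd import grad

w = torch.randn(2, requires_grad=True)
loss = (w * w).sum()

grad(loss, w)  # first call succeeds and frees the graph's buffers

try:
    grad(loss, w)  # second call walks the same, already-freed graph
    second_call_ok = True
except RuntimeError:
    # raises the "backward through the graph a second time" error
    second_call_ok = False
```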

Can you share the code around your main training loop, or a small code sample (30-40 lines) that reproduces the error?

I am using an LSTM for the Penn Treebank problem.
Also, one doubt that came to mind: for LSTMs, do I have to initialize the hidden states in every loop iteration, or just once at the start?

def train_loss(data, target, ht, ct):
    out, (ht, ct) = myModel(data, ht, ct)
    loss = F.cross_entropy(out, target)
    return loss, (ht, ct)

for epoch in range(EPOCHS):
    trainloss = 0.0
    t0 = time.time()
    for batch in train_loader:
        data, target = batch.text.t(), batch.target.t()
        loss, (ht, ct) = train_loss(data, target.reshape(-1), ht, ct)

        grads = grad(loss, Ws)
        trainloss += loss.item()

You do have to either re-initialize it or at least .detach() it at every iteration. Otherwise, you will try to backprop through all the previous iterations of your model.
That is what causes the error you see, since you already backpropagated through that part of the graph before.
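A sketch of the pattern being described, with an illustrative LSTM (the shapes are made up): detaching the state between iterations cuts the graph, so each grad() call only backprops through the current step:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the poster's model.
lstm = nn.LSTM(input_size=5, hidden_size=7, batch_first=True)
ht = torch.zeros(1, 2, 7)
ct = torch.zeros(1, 2, 7)

for step in range(3):
    x = torch.randn(2, 4, 5)
    out, (ht, ct) = lstm(x, (ht, ct))
    loss = out.sum()
    grads = torch.autograd.grad(loss, lstm.parameters())
    # Cut the graph so the next iteration does not backprop into
    # this one; without this, the second grad() call raises the
    # "backward through the graph a second time" error.
    ht, ct = ht.detach(), ct.detach()
```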


Yes, right.

The problem is solved by using

ht = ht.clone().detach()
ct = ct.clone().detach()

Thank you.
