How to use torch.autograd.grad together with torch.no_grad() in model forward?

Hello there!

I want to compute the gradient of a model's output with respect to its input, where the forward function contains a torch.no_grad() block.
In the example below the resulting grad is None, although x.requires_grad is True when I debug the forward pass.

Here is an example script:

import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self, in_dim=10, out_dim=1):
        super().__init__()
        self.instancenorm = nn.InstanceNorm1d(in_dim)
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        with torch.no_grad():
            x = self.instancenorm(x)

        x = self.fc(x)
        return x


model = Network(in_dim=10, out_dim=1)
model.eval()

x = torch.rand(1,10)
x.requires_grad_(True)

score = model(x)


grad = torch.autograd.grad(
    outputs=score,
    inputs=x,
    allow_unused=True)[0]

print(grad)

When I remove

with torch.no_grad(): ...

it works. However, setting x.requires_grad_(True) after torch.no_grad() also results in None.

What causes this and how can I solve it? And why is this not a problem during training?

Thanks!

Hey, the behaviour is expected.

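torch.no_grad() essentially tells autograd (PyTorch’s automatic differentiation engine) to “look away”: operations run inside the block are not recorded, and their outputs have requires_grad=False.

Hence, by using it in your forward pass, you break the computation graph between x and the output, which is why the gradient is None. (You only get None because you pass allow_unused=True; without it, torch.autograd.grad would raise an error saying the input was not used in the graph.)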
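
To make the break visible, here is a minimal sketch. It does not use your Network class; the hand-written normalization is only a stand-in I am assuming for the instance norm step, and the names fc, h, grad_x and grad_w are made up for illustration. It also hints at why training still works: the parameters of the linear layer come after the no_grad block, so they remain connected to the output even though x does not.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

fc = nn.Linear(10, 1)
x = torch.rand(1, 10, requires_grad=True)

with torch.no_grad():
    # Stand-in for the normalization step (an assumption, not nn.InstanceNorm1d).
    # Nothing computed in this block is recorded by autograd.
    h = (x - x.mean()) / x.std()

print(h.requires_grad)  # False: h is detached from x
print(h.grad_fn)        # None: no history was recorded

# The output still requires grad, but only because of fc's parameters.
score = fc(h).sum()

grad_x, grad_w = torch.autograd.grad(
    outputs=score,
    inputs=[x, fc.weight],
    allow_unused=True,
)

print(grad_x)        # None: x is not part of the graph that produced score
print(grad_w.shape)  # torch.Size([1, 10]): the weight is still reachable
```

So a backward() call in a training loop fills in the gradients of fc.weight and fc.bias as usual, while the path back to the input is cut.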


Thx for your answer! How can I solve this if I want to keep the instance norm layer in the model?

And why is there no problem computing the gradient with backward() during training?

Why are you using no_grad with instance norm?

You should remove the no_grad context manager for this to work. As @soulitzer pointed out, what’s the reason you are using instance norm with no_grad?
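
A rough sketch of what that could look like, keeping the instance norm layer in the model. The constructor arguments channels and length and the (1, 1, 10) input shape are my assumptions, chosen so that InstanceNorm1d receives a (batch, channels, length) tensor; adapt them to your data. If the no_grad block was only meant to keep the layer from being trained, freezing parameters with requires_grad_(False) is one alternative that leaves the input gradient intact (with the default affine=False, InstanceNorm1d has no learnable parameters in the first place).

```python
import torch
import torch.nn as nn


class Network(nn.Module):
    # channels/length are illustrative assumptions, not the original signature.
    def __init__(self, channels=1, length=10, out_dim=1):
        super().__init__()
        self.instancenorm = nn.InstanceNorm1d(channels)
        self.fc = nn.Linear(length, out_dim)

    def forward(self, x):
        # No torch.no_grad() here, so the normalization stays in the graph.
        x = self.instancenorm(x)
        return self.fc(x)


model = Network().eval()

# Optional: freeze the parameters instead of wrapping layers in no_grad.
for p in model.parameters():
    p.requires_grad_(False)

x = torch.rand(1, 1, 10, requires_grad=True)
score = model(x).sum()

# x is now connected to score, so allow_unused is not needed.
(grad,) = torch.autograd.grad(outputs=score, inputs=x)

print(grad.shape)  # torch.Size([1, 1, 10])
```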