PyTorch `torch.no_grad` vs `torch.inference_mode`

timgianitsos · October 13, 2021, 6:33am

As of v1.9, PyTorch has a new context manager, torch.inference_mode, which the docs describe as “analogous to torch.no_grad… Code run under this mode gets better performance by disabling view tracking and version counter bumps.”

If I am just evaluating my model at test time (i.e. not training), is there any situation where torch.no_grad is preferable to torch.inference_mode? I plan to replace every instance of the former with the latter, and I expect to use runtime errors as a guardrail (i.e. I trust that any issue would reveal itself as a runtime error, and if it doesn’t surface as a runtime error then I assume it is indeed preferable to use torch.inference_mode).
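For concreteness, here is a minimal sketch of the swap being asked about. The tiny `torch.nn.Linear` model and the random batch are hypothetical stand-ins, not anything from a real project. It runs the same forward pass under both context managers and checks that the outputs match and that only the `inference_mode` output is flagged as an inference tensor.

```python
import torch

# Hypothetical stand-in for a real model and a real test batch.
model = torch.nn.Linear(4, 2)
model.eval()
batch = torch.randn(3, 4)

# Evaluation under the older context manager.
with torch.no_grad():
    out_no_grad = model(batch)

# The same evaluation under inference mode.
with torch.inference_mode():
    out_inference = model(batch)

# Both outputs hold the same values and neither tracks gradients.
same_values = torch.allclose(out_no_grad, out_inference)
print(same_values)
print(out_no_grad.requires_grad, out_inference.requires_grad)

# Only the inference_mode output is marked as an inference tensor.
print(out_no_grad.is_inference(), out_inference.is_inference())
```

For a pure forward pass like this the two are interchangeable; the difference only shows up in what you may do with the outputs afterwards.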

More details on why inference mode was developed are mentioned in the PyTorch Developer Podcast.

I asked a similar question elsewhere but it was the wrong forum.

ptrblck · October 13, 2021, 8:02am

Yes, you can depend on runtime errors: as long as no errors are raised, your code should be fine.
One difference is that you are not allowed to set the requires_grad attribute on tensors created in an inference_mode context, whereas this works for tensors created under no_grad:

import torch

with torch.no_grad():
    x = torch.randn(1)
    y = x + 1

y.requires_grad = True
z = y + 1
print(z.grad_fn)
> <AddBackward0 object at 0x7fe9c6eafdf0>

with torch.inference_mode():
    x = torch.randn(1)
    y = x + 1

y.requires_grad = True
> RuntimeError: Setting requires_grad=True on inference tensor outside InferenceMode is not allowed.
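If you do need gradients later for a tensor that came out of inference mode, one possible workaround is to clone it outside the context, since the clone is a normal tensor. The sketch below assumes that approach; the variable names `flag_error` and `y_normal` are only illustrative.

```python
import torch

with torch.inference_mode():
    y = torch.randn(1) + 1

# Tensors created inside the context are marked as inference tensors.
print(y.is_inference())

# Setting requires_grad on the inference tensor itself raises a RuntimeError.
try:
    y.requires_grad = True
    flag_error = None
except RuntimeError as err:
    flag_error = err
print(type(flag_error).__name__)

# Cloning outside inference mode gives a normal tensor that autograd can track.
y_normal = y.clone()
y_normal.requires_grad_(True)
z = y_normal + 1
print(y_normal.is_inference(), z.grad_fn is not None)
```

The clone costs an extra copy, so if a tensor is likely to need gradients later, creating it under torch.no_grad instead may be the simpler choice.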

Sakshi_Bhatia · August 5, 2025, 8:06pm

Whoa, that’s interesting! So if I have a dummy tensor inside my model that only exists to keep dimensions compatible, I should preferably create it under torch.inference_mode(), since I do not need to take gradients on it. Am I right?