I know gradients are used to update the weights before the next forward pass. But looking at the two examples below, I am confused about where the gradients are stored during backpropagation in a typical PyTorch training setup.
From example 1 below, which simulates a very simple backpropagation step (taken from the linked tutorial), I get the impression that the tensors a and b store the gradients themselves. After Q.backward() is called (which I guess is possible because Q's operands have requires_grad=True), print(a.grad, b.grad) returns non-None values, whereas before the call it returned None (see the check after the snippet below). So that gives me the impression the gradients are stored on Q's operands.
(Source: A Gentle Introduction to torch.autograd — PyTorch Tutorials 2.3.0+cu121 documentation)
import torch
a = torch.tensor([2, 3, 12], dtype=torch.float32, requires_grad=True)
b = torch.tensor([6, 4, 5], dtype=torch.float32, requires_grad=True)
Q = 3*a**3 - b**2
external_grad = torch.tensor([1, 1, 1], dtype=torch.float32)
Q.backward(gradient=external_grad)
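For reference, this is the before/after check I mentioned, continuing from the snippet above; the expected values are just my reading of dQ/da = 9a**2 and dQ/db = -2b from the tutorial:

# Before Q.backward() was called, a.grad and b.grad were both None.
# Afterwards they hold dQ/da = 9*a**2 and dQ/db = -2*b:
print(a.grad)   # tensor([  36.,   81., 1296.])
print(b.grad)   # tensor([-12.,  -8., -10.])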
But in example 2 below, a very simple ResNet model run on random input data, it is much less clear where the gradients are stored, whereas in example 1 I know exactly where they live and can see their values before and after backward() is called:
import torch, torchvision
model = torchvision.models.resnet18(pretrained=True)
data = torch.rand(1, 3, 64, 64)      # one random 64x64 RGB image
labels = torch.rand(1, 1000)         # random targets for the 1000 classes
prediction = model(data)             # forward pass
loss = (prediction - labels).sum()
loss.backward()                      # backward pass
Are the gradients stored somewhere in the model along with the weights? If so, how do I print them out?
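My guess, and I am not sure it is right, is that each parameter tensor inside the model carries its own .grad attribute, just like a and b do in example 1, so something like the following would print them:

# Guess: after loss.backward(), each parameter's gradient should sit on
# its .grad attribute, the same way a.grad/b.grad did in example 1.
for name, param in model.named_parameters():
    print(name, None if param.grad is None else param.grad.shape)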