`requires_grad=False` makes sense. Thanks.
So in the forward pass, the input, a `Variable` instance, flows through the model. Each layer in the model is really just a single function in a larger, composed function, and each layer creates a new `Variable` instance as its output. Each of these newly created variables has a `grad_fn` that tells the variable how it was created. When `.backward()` is called, each variable can use this cached `grad_fn` to differentiate itself.
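To check my understanding, here's a minimal sketch (the exact repr of `grad_fn`, e.g. `MulBackward`, is an assumption on my part and varies by op and version):

>>> import torch
>>> from torch.autograd import Variable
>>> x = Variable(torch.ones(2), requires_grad=True)
>>> y = x * 2  # `y` is a new Variable created by the multiply function
>>> y.grad_fn  # Records how `y` was created, e.g. <MulBackward ...>
>>> y.sum().backward()
>>> x.grad  # Filled in by walking the cached grad_fn chain; here a tensor of 2s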
If all that is correct, I have a follow-up question: where is the reference to the layer parameters stored? The variables do update the model parameters (not the `.data`, but the `.grad` field). For example:
>>> params = [p for p in model.parameters()]
>>> print(params[0].grad)  # Will be `None`
>>> pred = model(input_)
>>> error = loss(pred, target)
>>> error.backward()
>>> print(params[0].grad)  # Will print a tensor of gradients
How does each `Variable` know where the parameters are for the layer that created it?
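The closest I've gotten to an answer is poking at the graph by hand. This relies on undocumented internals (`next_functions`, and the `.variable` attribute on `AccumulateGrad` nodes), and it assumes the last layer of `model` is an `nn.Linear`, so treat it as a sketch:

>>> pred.grad_fn  # e.g. <AddmmBackward ...> for an nn.Linear output
>>> fns = pred.grad_fn.next_functions  # tuple of (function, input index) pairs
>>> # Leaf parameters appear as AccumulateGrad nodes that hold a reference:
>>> [f.variable for f, _ in fns if hasattr(f, 'variable')]

So it looks like the references live in the graph nodes rather than in the output `Variable` itself; is that the right mental model?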