I have a custom nn.Module that holds both an nn.Parameter and a Variable as model parameters, for example:
self.weight = nn.Parameter(torch.rand(3, 3))
self.fw = Variable(torch.rand(3, 3))

def forward(self, x):
    # ... other computations with fw and weight
So what is the difference between a Variable and an nn.Parameter when backward() is called on the Module's output?
From the docstring of nn.Parameter:
"A kind of Tensor that is to be considered a module parameter.
Parameters are :class:`~torch.Tensor` subclasses, that have a
very special property when used with :class:`Module` s - when they're
assigned as Module attributes they are automatically added to the list of
its parameters, and will appear e.g. in :meth:`~Module.parameters` iterator.
Assigning a Tensor doesn't have such effect. This is because one might
want to cache some temporary state, like last hidden state of the RNN, in
the model. If there was no such class as :class:`Parameter`, these
temporaries would get registered too.

Arguments:
    data (Tensor): parameter tensor.
    requires_grad (bool, optional): if the parameter requires gradient. See
        :ref:`excluding-subgraphs` for more details. Default: `True`"
Variable is deprecated in recent PyTorch releases (since version 0.4); plain tensors are used instead.
As I understand it, this means an nn.Parameter assigned as a module attribute is added to the module's parameter list automatically, while a Variable (or plain tensor) is not.
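A minimal sketch to check this understanding, assuming a recent PyTorch where a plain tensor stands in for the deprecated Variable. The class name `Toy` and the body of `forward` are hypothetical, made up only for illustration:

```python
import torch
import torch.nn as nn


class Toy(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # nn.Parameter: registered as a module parameter on assignment
        self.weight = nn.Parameter(torch.rand(3, 3))
        # plain tensor (the modern replacement for Variable): just an attribute
        self.fw = torch.rand(3, 3)

    def forward(self, x):
        # assumed computation that uses both tensors
        return x @ self.weight + self.fw


model = Toy()
param_names = [name for name, _ in model.named_parameters()]
state_keys = list(model.state_dict().keys())
print(param_names)  # ['weight']
print(state_keys)   # ['weight']
```

Only `weight` shows up in `named_parameters()` and in `state_dict()`; `fw` is an ordinary Python attribute that the module does not track.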
Generally, the optimizer only updates the tensors it is given, which is usually module.parameters(). But how does a Variable behave when backward() is called? Will a Variable stored in a Module be optimized in the same way as an nn.Parameter?
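A small experiment that may help separate the two steps (gradient computation by backward() and the update by the optimizer). It is only a sketch under assumptions: the tensors are standalone rather than inside a Module, the loss is made up, and only `weight` is handed to the optimizer, mimicking what module.parameters() would return:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

weight = nn.Parameter(torch.rand(3, 3))      # requires_grad=True by default
fw = torch.rand(3, 3, requires_grad=True)    # plain tensor that tracks gradients
fw_no_grad = torch.rand(3, 3)                # plain tensor, requires_grad=False

x = torch.rand(2, 3)
# made-up loss that uses all three tensors
loss = (x @ weight + x @ fw + x @ fw_no_grad).sum()
loss.backward()

# backward() fills .grad for every leaf tensor with requires_grad=True,
# whether or not it is an nn.Parameter
print(weight.grad is not None)   # True
print(fw.grad is not None)       # True
print(fw_no_grad.grad is None)   # True

# the optimizer only updates the tensors it was given
optimizer = torch.optim.SGD([weight], lr=0.1)
weight_before = weight.detach().clone()
fw_before = fw.detach().clone()
optimizer.step()

weight_changed = not torch.equal(weight_before, weight.detach())
fw_changed = not torch.equal(fw_before, fw.detach())
print(weight_changed, fw_changed)  # True False
```

In this sketch `fw` receives a gradient but is never updated, because it was not passed to the optimizer. To have it optimized, it would need to be wrapped in nn.Parameter or added to the optimizer's parameter list explicitly.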