Your example is correct.
Note that you should wrap this tensor in nn.Parameter if you would like to optimize it inside an nn.Module. nn.Parameters will be automatically registered inside modules if you assign them as attributes:
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        A = torch.empty(5, 7, device='cpu')   # uninitialized tensor
        self.A = nn.Parameter(A)              # wrapping registers A as a parameter

    def forward(self, x):
        return x * self.A

module = MyModule()
print(dict(module.named_parameters()))
> {'A': Parameter containing:
tensor([[-7.8389e-37, 3.0623e-41, -7.8627e-37, 3.0623e-41, 1.1210e-43,
0.0000e+00, 8.9683e-44],
[ 0.0000e+00, -7.8579e-37, 3.0623e-41, 1.4013e-45, 0.0000e+00,
0.0000e+00, 0.0000e+00],
[ 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, -7.7193e-37],
[ 3.0623e-41, 1.8077e-43, 0.0000e+00, 4.7530e-06, 4.5845e-41,
-7.8459e-38, 3.0623e-41],
[ 0.0000e+00, 0.0000e+00, 1.3593e-43, 0.0000e+00, -7.9340e-37,
3.0623e-41, -7.8739e-37]], requires_grad=True)}
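For comparison, a plain tensor assigned as an attribute will not be registered. A minimal sketch (the module name PlainModule is just for illustration):

import torch
import torch.nn as nn

class PlainModule(nn.Module):
    def __init__(self):
        super(PlainModule, self).__init__()
        # plain tensor attribute, not wrapped in nn.Parameter
        self.A = torch.empty(5, 7, device='cpu')

    def forward(self, x):
        return x * self.A

module = PlainModule()
print(dict(module.named_parameters()))  # {} - the tensor was not registered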
By wrapping them in nn.Parameter, the requires_grad attribute will be set to True by default.
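Since requires_grad is True, an optimizer created from module.parameters() will update A during training. A minimal sketch, assuming a dummy input and MSE loss (in practice you would also initialize A to sensible values instead of using torch.empty):

import torch
import torch.nn as nn

module = MyModule()
optimizer = torch.optim.SGD(module.parameters(), lr=0.1)

x = torch.randn(5, 7)       # dummy input matching A's shape (assumption)
target = torch.randn(5, 7)  # dummy target

out = module(x)
loss = nn.functional.mse_loss(out, target)

optimizer.zero_grad()
loss.backward()   # gradients flow into module.A
optimizer.step()  # module.A is updated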
Let me know if you have more questions or if something is unclear.