I generally define my networks in the `__init__` function using `torch.nn` layers. However, if I want my convolution to be dependent on the size of the input, it might make more sense to use the `torch.nn.functional` version. Are those parameters learnable too? If so, how does that affect the state dict, particularly when loading and saving models?

When using `torch.nn.Linear`, for example, the `nn.Linear` class looks after initialising and using the parameters that it requires. When using the `torch.nn.functional.linear` variant, it is up to you to provide the parameters on each forward pass.

Basically, instead of

```
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, input):
        return self.linear(input)
```

you would do this:

```
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
```

In the first case, Model knows that it has an `nn.Linear` submodule; in the second case, Model knows that it has two parameter tensors.

So in the first case, `Model.parameters()` will list the weight and bias parameters of the `nn.Linear` submodule; in the second case, `Model.parameters()` will list the weight and bias parameters defined in `__init__`.
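To make that concrete, here is a minimal sketch of both variants side by side (`in_features=3`, `out_features=2` are arbitrary sizes chosen for illustration), printing what each one registers:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

in_features, out_features = 3, 2  # arbitrary sizes for illustration

class ModuleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, input):
        return self.linear(input)

class FunctionalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

# The submodule's parameters appear under a "linear." prefix;
# the directly registered ones under their own attribute names.
print([name for name, _ in ModuleModel().named_parameters()])
# ['linear.weight', 'linear.bias']
print([name for name, _ in FunctionalModel().named_parameters()])
# ['weight', 'bias']
```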

Training, saving and loading can all be done in exactly the same way in both cases.
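For example, saving and loading via `state_dict` works the same way for directly registered parameters as it does for submodules. A sketch, using the functional-style model from above (an in-memory buffer stands in for a file here):

```
import io

import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self, in_features=3, out_features=2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

m = Model()
buffer = io.BytesIO()
torch.save(m.state_dict(), buffer)  # both registered tensors are saved

buffer.seek(0)
m2 = Model()
m2.load_state_dict(torch.load(buffer))  # restored under the same keys
assert torch.equal(m.weight, m2.weight) and torch.equal(m.bias, m2.bias)
```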

```
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

    def forward(self, input):
        return F.linear(input, torch.randn(out_features, in_features))
```

In this case, `Model.parameters()` yields nothing, since the random weight tensor is never registered:

```
m = Model()
len(list(m.parameters()))  # returns 0
```

So, in this case, is the parameter tensor of size `[out_features, in_features]` passed to the linear function learnable?

During backpropagation, can I use SGD to update the parameters in the linear function?

No. In the way you do it here, a random tensor is indeed created, but it is not part of the learnable set of parameters of your model, because it was never registered as an `nn.Parameter` attribute of the module. See "Are torch.nn.Functional layers learnable?" for clarification.
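A minimal sketch of the fix: wrap the tensor in `nn.Parameter` and store it on `self`, and it becomes learnable (the sizes here are arbitrary, chosen just for illustration):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self, in_features=3, out_features=2):
        super().__init__()
        # Registered as a parameter -> shows up in parameters(), gets gradients
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, input):
        return F.linear(input, self.weight)

m = Model()
print(len(list(m.parameters())))  # 1, unlike 0 for the unregistered tensor

# and SGD can now update it during backpropagation:
opt = torch.optim.SGD(m.parameters(), lr=0.1)
before = m.weight.detach().clone()
loss = m(torch.randn(4, 3)).sum()
loss.backward()
opt.step()  # self.weight has been updated in place
```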