Hi,
I have a pretrained model that contains a linear layer. It was trained with bias=False. I want to set bias=True in the same model, because converting the model to ONNX has some dependencies on it. Is there any way to do this so that the model's performance stays the same?
Hi Sourabh!
The simplest way to do this will be to replace the Linear in question with a new Linear with bias = True and then initialize the new Linear's weight (and bias) with the values from the old Linear.
You need to get your hands on the Linear in question somehow. It will typically be a property of the model or some sub-model or an element of some collection. I will illustrate this with an example where it is an element of a Sequential.
>>> import torch
>>> print (torch.__version__)
2.1.0
>>>
>>> _ = torch.manual_seed (2023)
>>>
>>> seq = torch.nn.Sequential (torch.nn.Linear (3, 2, bias = False), torch.nn.Tanh(), torch.nn.Linear (2, 1))
>>> seq[0]
Linear(in_features=3, out_features=2, bias=False)
>>> seq[0].weight
Parameter containing:
tensor([[-0.0820,  0.2541,  0.5175],
        [-0.0235,  0.0478,  0.5665]], requires_grad=True)
>>> oldLin = seq[0]
>>> seq[0] = torch.nn.Linear (3, 2, bias = True)
>>>
>>> with torch.no_grad():
...     _ = seq[0].weight.copy_ (oldLin.weight)   # copy pre-trained weight into new layer
...     _ = seq[0].bias.zero_()                   # zero out new bias for consistency with pre-trained model
...
>>> seq
Sequential(
  (0): Linear(in_features=3, out_features=2, bias=True)
  (1): Tanh()
  (2): Linear(in_features=2, out_features=1, bias=True)
)
>>> seq[0]
Linear(in_features=3, out_features=2, bias=True)
>>> seq[0].weight
Parameter containing:
tensor([[-0.0820,  0.2541,  0.5175],
        [-0.0235,  0.0478,  0.5665]], requires_grad=True)
>>> seq[0].bias
Parameter containing:
tensor([0., 0.], requires_grad=True)
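Since the goal is for performance to stay the same, it can be worth checking that the old and new layers give the same output on some input. Here is a minimal sketch of such a check. The layer sizes, the random input x, and the names old_lin and new_lin are just illustrative stand-ins, not anything from your actual model.

```python
import torch

torch.manual_seed(2023)

# Hypothetical stand-in for the pre-trained layer (trained with bias=False).
old_lin = torch.nn.Linear(3, 2, bias=False)

# Replacement layer that has a bias term.
new_lin = torch.nn.Linear(3, 2, bias=True)
with torch.no_grad():
    new_lin.weight.copy_(old_lin.weight)  # reuse the pre-trained weight
    new_lin.bias.zero_()                  # a zero bias leaves the output unchanged

# Compare the two layers on an arbitrary batch of inputs.
x = torch.randn(4, 3)
outputs_match = torch.allclose(old_lin(x), new_lin(x))
print(outputs_match)
```

Note that the zero bias is still a trainable parameter, so the outputs will only keep matching as long as you don't train the model further.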
Best.
K. Frank
Hi,
Thanks for the solution. I also tried out something similar which worked for me.
import copy
import torch

# model is the pre-trained model whose linear layer has bias=False
model_with_bias = copy.deepcopy(model)

# list the parameter names to check which linear layer we want to modify
for name, param in model_with_bias.named_parameters():
    print(name)

layer_name = 'linear'

# get the linear layer's weights only
weights = getattr(model_with_bias, layer_name).weight
print(weights.shape)
# Linear stores its weight with shape (out_features, in_features)
out_features, in_features = weights.shape

# create a new linear layer with bias=True
new_linear_layer = torch.nn.Linear(in_features, out_features, bias=True)

# clone the weights into the new linear layer's weights
new_linear_layer.weight.data = weights.clone()

# set all bias terms to zero, since they start with a random initialization
new_linear_layer.bias.data.fill_(0)

# replace the old linear layer (bias=False) with the new linear layer (bias=True)
setattr(model_with_bias, layer_name, new_linear_layer)
Regards,
Sourabh
Hi Sourabh!
A quick caution: .data is deprecated and can lead to errors. (I do believe that your usage of .data will work the way you want with the current version of pytorch, but why ask for trouble?)
The preferred way to do this is to mutate the tensors (i.e., use functions that modify the tensors in place) under the protection of a with torch.no_grad(): block (as illustrated in the example I posted).
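For what it's worth, here is a rough sketch of how the setattr() approach could be combined with torch.no_grad() instead of .data. The helper add_zero_bias() and the TinyModel class are hypothetical names made up for this illustration, and the sketch assumes the layer can be reached by its dotted module name (as reported by named_modules()).

```python
import torch


def add_zero_bias(model: torch.nn.Module, layer_name: str) -> torch.nn.Linear:
    """Replace the Linear called layer_name with a copy that has a zero bias."""
    old = model.get_submodule(layer_name)
    if not isinstance(old, torch.nn.Linear):
        raise TypeError(f"{layer_name!r} is not a torch.nn.Linear")

    # New layer on the same device and with the same dtype as the old one.
    new = torch.nn.Linear(
        old.in_features,
        old.out_features,
        bias=True,
        device=old.weight.device,
        dtype=old.weight.dtype,
    )

    # In-place copies under no_grad(), rather than assigning to .data.
    with torch.no_grad():
        new.weight.copy_(old.weight)
        new.bias.zero_()

    # Swap the new layer in on whichever module owns the old one.
    parent_name, _, child_name = layer_name.rpartition(".")
    parent = model.get_submodule(parent_name) if parent_name else model
    setattr(parent, child_name, new)
    return new


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for a pre-trained model."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 2, bias=False)

    def forward(self, x):
        return torch.tanh(self.linear(x))


torch.manual_seed(2023)
model = TinyModel()
x = torch.randn(4, 3)

before = model(x)
add_zero_bias(model, "linear")
after = model(x)

same_output = torch.allclose(before, after)
print(model.linear, same_output)
```

Using the dotted name means the same helper should also work for a layer nested inside a sub-model, for example a name like "encoder.linear" (again, a made-up name).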
Best.
K. Frank
Hi,
Thank you so much for the feedback. I will keep your input in mind.
Regards,
Sourabh