Hello,
I’m trying to achieve 2 things:
- enlarge a linear layer’s weights.
- freeze some of its units from gradient propagation.
This is what I’ve written:
1. Where new_w is a Tensor, new_w.shape is larger than model.output_layer.weight.shape, and model.output_layer is a Linear layer:
model.output_layer.weight.data = new_w
If I print(model.output_layer.weight.data.shape) it does show the new shape; however, if I print(model.output_layer) it still shows the old arguments to the Linear layer (the original in_features and out_features).
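Here is a minimal, self-contained version of what I mean (the layer and tensor sizes are just made up for the example):

```python
import torch
import torch.nn as nn

# Toy layer with made-up sizes; its original weight shape is (2, 4)
layer = nn.Linear(4, 2)
new_w = torch.randn(3, 5)          # a larger replacement weight

layer.weight.data = new_w
print(layer.weight.data.shape)     # torch.Size([3, 5]) -- the new shape shows up
print(layer)                       # Linear(in_features=4, out_features=2, bias=True) -- still the old arguments
```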
2. From what I’ve experimented with, I don’t think PyTorch allows you to freeze individual units rather than whole layers.
for p in copied.output_layer.parameters():
    # try to freeze one unit and leave another trainable
    p[0, 0].requires_grad = False
    p[old_w.shape[0] - 1, old_w.shape[1] - 1].requires_grad = True
    print(p[0, 0], p[0, 0].requires_grad)
    print(p[old_w.shape[0] - 1, old_w.shape[1] - 1],
          p[old_w.shape[0] - 1, old_w.shape[1] - 1].requires_grad)
    break  # only the weight, ignore the bias for now
This will basically output False for both.
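To double-check that I wasn’t misreading it, I tried the same thing on a throwaway layer (sizes made up; I froze the whole parameter first, since on a parameter that still requires grad, setting the flag on a slice raises an error for me). It looks like each indexing operation returns a brand-new tensor, so the flag only ever lands on a temporary:

```python
import torch.nn as nn

# Throwaway layer with made-up sizes
layer = nn.Linear(4, 2)
layer.weight.requires_grad_(False)   # freeze the whole parameter first

w = layer.weight
print(w[0, 0] is w[0, 0])            # False -- every indexing returns a new tensor object
w[0, 0].requires_grad = True         # the flag is set on that temporary tensor only
print(w[0, 0].requires_grad)         # False -- a fresh slice just inherits the parameter's flag
```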
Any ideas or suggestions? Thank you!