Hello,
I have trained a language model and now I want to fine-tune this pre-trained model.
model.summary()
model.load_encoder('lmtest')
model.freeze()
model.summary()
Before loading the encoder I look at the model summary, and again after loading and freezing. However, both summaries show the same trainable modules and the same number of trainable parameters. I would expect freeze() to set everything other than the last layer to non-trainable, if I understood correctly. So why doesn't freeze() change anything visible?
I am quite new to PyTorch and would appreciate your guidance.
ecdrid (Aditya)
May 5, 2020, 10:11am
After calling freeze(), try printing the parameters that are still trainable:

for name, params in model.named_parameters():
    if params.requires_grad:
        print(name)
Thanks a lot! I tried it and got the following error:
'RNNLearner' object has no attribute 'named_parameters'
This post says it might be an indentation error, but I checked and there was none.
ecdrid (Aditya)
May 5, 2020, 10:21am
What's RNNLearner? Also, if it's a model that inherits from nn.Module, then the above 3 lines will definitely run.
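For example, with a throwaway module (just to show the claim, nothing fastai-specific):

import torch.nn as nn

# any nn.Module exposes named_parameters(), and freshly created
# parameters have requires_grad=True
m = nn.Linear(2, 3)
for name, p in m.named_parameters():
    print(name, p.requires_grad)   # prints: weight True, then bias True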
It is from the ULMFiT architecture and, as I understand it, takes a SequentialRNN (which inherits from nn.Sequential) as input.
for i,w in enumerate(itos_new):
    r = stoi_wgts[w] if w in stoi_wgts else -1
    new_w[i] = enc_wgts[r] if r>=0 else wgts_m
    if dec_bias is not None: new_b[i] = dec_bias[r] if r>=0 else bias_m
wgts['0.encoder.weight'] = new_w
if '0.encoder_dp.emb.weight' in wgts: wgts['0.encoder_dp.emb.weight'] = new_w.clone()
wgts['1.decoder.weight'] = new_w.clone()
if dec_bias is not None: wgts['1.decoder.bias'] = new_b
return wgts
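(For context, that fragment copies pretrained embedding rows onto a new vocabulary, falling back to the mean embedding for unseen words. A toy version of the same remapping, with made-up names and sizes rather than real fastai weights:)

import torch

# toy version of the remapping above; vocab and sizes are made up
enc_wgts  = torch.randn(3, 4)                  # old embedding: 3 words, dim 4
stoi_wgts = {'the': 0, 'cat': 1, 'sat': 2}     # old vocab: word -> row index
itos_new  = ['the', 'dog', 'sat']              # new vocab; 'dog' was never seen
wgts_m    = enc_wgts.mean(0)                   # mean row, fallback for unseen words
new_w = torch.zeros(len(itos_new), 4)
for i, w in enumerate(itos_new):
    r = stoi_wgts.get(w, -1)
    new_w[i] = enc_wgts[r] if r >= 0 else wgts_m   # reuse the old row or the mean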
class RNNLearner(Learner):
    "Basic class for a `Learner` in NLP."
    def __init__(self, data:DataBunch, model:nn.Module, split_func:OptSplitFunc=None, clip:float=None,
                 alpha:float=2., beta:float=1., metrics=None, **learn_kwargs):
        is_class = (hasattr(data.train_ds, 'y') and (isinstance(data.train_ds.y, CategoryList) or
                    isinstance(data.train_ds.y, LMLabelList)))
        metrics = ifnone(metrics, ([accuracy] if is_class else []))
        super().__init__(data, model, metrics=metrics, **learn_kwargs)
        self.callbacks.append(RNNTrainer(self, alpha=alpha, beta=beta))
        if clip: self.callback_fns.append(partial(GradientClipping, clip=clip))
        if split_func: self.split(split_func)
        self.output_dp = RNNDropout(output_p)
        if bias: self.decoder.bias.data.zero_()
        if tie_encoder: self.decoder.weight = tie_encoder.weight

    def forward(self, input:Tuple[Tensor,Tensor])->Tuple[Tensor,Tensor,Tensor]:
        raw_outputs, outputs = input
        output = self.output_dp(outputs[-1])
        decoded = self.decoder(output)
        return decoded, raw_outputs, outputs
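(The tie_encoder line is standard weight tying between the decoder and the embedding; a minimal illustration of what sharing the Parameter means, with made-up sizes:)

import torch.nn as nn

# the decoder and the embedding share one Parameter object, so updating
# or freezing one does the same to the other (sizes are made up)
emb = nn.Embedding(100, 32)    # vocab 100, embedding dim 32
dec = nn.Linear(32, 100)       # decoder maps hidden dim back to vocab
dec.weight = emb.weight        # tie: same Parameter, not a copy
assert dec.weight is emb.weight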
class SequentialRNN(nn.Sequential):
    "A sequential module that passes the reset call to its children."
    def reset(self):
        for c in self.children():
            if hasattr(c, 'reset'): c.reset()

def awd_lstm_lm_split(model:nn.Module) -> List[List[nn.Module]]:
    "Split a RNN `model` in groups for differential learning rates."
    groups = [[rnn, dp] for rnn, dp in zip(model[0].rnns, model[0].hidden_dps)]
    return groups + [[model[0].encoder, model[0].encoder_dp, model[1]]]
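(If I read the fastai source right, this split function is what gives freeze() its notion of the "last layer": freeze() roughly sets requires_grad=False on every layer group except the last. A simplified sketch, where learn stands for the learner object:)

# simplified sketch of what freeze() does with those groups: every group
# except the last becomes non-trainable (real fastai additionally keeps
# batchnorm-type layers trainable; that detail is omitted here)
groups = awd_lstm_lm_split(learn.model)
for group in groups[:-1]:
    for module in group:
        for p in module.parameters():
            p.requires_grad = False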
ecdrid (Aditya)
May 5, 2020, 10:30am
Try this,

for name, params in your_learner_object.model.named_parameters():
    if params.requires_grad:
        print(name)
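(The earlier AttributeError happened, as far as I can tell, because RNNLearner is a Learner, i.e. a plain training wrapper rather than an nn.Module; the actual nn.Module sits in its .model attribute. A toy sketch of the difference:)

import torch.nn as nn

class Wrapper:                                   # toy stand-in for fastai's Learner
    def __init__(self, model): self.model = model

learn = Wrapper(nn.Linear(2, 3))
# learn.named_parameters()                       # AttributeError: plain object, not a Module
for name, p in learn.model.named_parameters():   # works: .model is the nn.Module
    print(name)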
Thanks a lot! It works now!
It only prints parameters from the last layer group, so freeze() works, I think:
1.layers.0.weight
1.layers.0.bias
1.layers.2.weight
1.layers.2.bias
1.layers.4.weight
1.layers.4.bias
1.layers.6.weight
1.layers.6.bias
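(To double-check against the summary counts, the trainable total can also be compared with the overall total directly; learn again stands for the learner object:)

n_total = sum(p.numel() for p in learn.model.parameters())
n_train = sum(p.numel() for p in learn.model.parameters() if p.requires_grad)
print(f'{n_train:,} of {n_total:,} parameters are trainable')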