My situation: I have a personal NLP model for text classification with BERT, which I already pre-trained on a corpus myself. I want to remove the last (classification) layers and add new final layers for another task, so that I can reuse the retained layers to build this other model for a similar task without re-training everything.
Following this topic, I was able to extract the layers that interest me (these layers are the BERT embeddings): How to delete layer in pretrained model?
Here’s my code :
```python
import torch.nn as nn

# Remove classification layer from my model
sub_model = nn.Sequential(*list(model.camembert.children())[:-1])
```
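As a side note, this pattern loses the original model's `forward()` logic: `children()` only returns the direct submodules, and `nn.Sequential` chains them with a single positional input. A minimal sketch with a toy wrapper module (hypothetical names, standing in for the transformer) shows the effect:

```python
import torch
import torch.nn as nn

# Toy stand-in: a module whose forward() takes a keyword argument,
# like a transformer taking attention_mask.
class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(4, 4)
        self.b = nn.Linear(4, 2)

    def forward(self, x, scale=1.0):
        return self.b(self.a(x)) * scale

w = Wrapper()

# Flatten the children and drop the last one (self.b).
seq = nn.Sequential(*list(w.children())[:-1])
print(seq)  # a Sequential containing only the Linear(4, 4)

out = seq(torch.randn(1, 4))         # a single positional input works
# seq(torch.randn(1, 4), scale=2.0)  # TypeError: unexpected keyword argument
```

So the `Sequential` accepts exactly one positional tensor per step; any keyword argument the original `forward()` accepted is gone.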
My problem: how do I use my new `sub_model` object to rebuild the same architecture as `model`, but with a different final layer? `sub_model` is now an `nn.Sequential` object, whereas `model.camembert` was a CamemBERT transformer.
I tried to literally replace `model.camembert` with `sub_model`, but it doesn't work. Here's the code:
```python
# Original model: works fine
logits = model.camembert(input_ids, attention_mask=attention_mask,
                         labels=None, output_hidden_states=False).logits

# New model: doesn't work
import torch.nn as nn
sub_model = nn.Sequential(*list(model.camembert.children())[:-1])
logits = sub_model(input_ids, attention_mask=attention_mask,
                   labels=None, output_hidden_states=False).logits
```
The error is :
TypeError: forward() got an unexpected keyword argument 'attention_mask'
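The error happens because `nn.Sequential.forward()` accepts only a single positional input, so it cannot forward `attention_mask` (or any other keyword argument) to the wrapped layers. One common alternative is to keep the model intact and swap only the classification head attribute. Below is a minimal sketch using a toy module in place of the real transformer; the names (`encoder`, `classifier`, the hidden size) are assumptions for illustration, not the actual CamemBERT attributes, though Hugging Face sequence-classification models do typically expose a `.classifier` head:

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer with a classification head.
class ToyClassifier(nn.Module):
    def __init__(self, hidden=8, num_labels=3):
        super().__init__()
        self.encoder = nn.Linear(8, hidden)          # pretend "body"
        self.classifier = nn.Linear(hidden, num_labels)  # the head

    def forward(self, x, attention_mask=None):
        h = torch.relu(self.encoder(x))
        return self.classifier(h)

model = ToyClassifier()

# Idiomatic transfer-learning fix: keep the model (and its forward
# signature) intact, and replace only the final layer for the new task.
model.classifier = nn.Linear(8, 5)  # new task with 5 labels

out = model(torch.randn(2, 8), attention_mask=None)
print(out.shape)  # torch.Size([2, 5])
```

Because the model object itself is unchanged, keyword arguments like `attention_mask` still reach `forward()` as before; only the head is new.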
I'm pretty new to PyTorch, so thank you very much for your help! Have a nice day!