Adding intermediary layers to a pre-trained (BERT) model

I would not try to monkey-patch the forward method as I assume it can break easily.

Instead, I would load the pre-trained model and make sure all parameters are properly loaded. Then I would manipulate the model by replacing a pre-trained layer with an nn.Sequential block containing the original pre-trained layer followed by the new one.
E.g. something like this might work:

import copy
import torch.nn as nn

# MyModel and NewLayer are placeholders for your own modules.
model = MyModel(pretrained=True)
my_new_layer = NewLayer()

# Keep a copy of the original pre-trained layer, then chain the new layer after it.
my_original_layer = copy.deepcopy(model.my_layer)
model.my_layer = nn.Sequential(
    my_original_layer,
    my_new_layer,
)