Extract several intermediate layers of pytorch-transformer models to form a new model

I am using the Hugging Face pytorch-transformers library for knowledge distillation.
How can I, for example, extract 8 of the 12 BertLayers of bert-base-uncased to form a new model? I want to reuse the embedding and pooler layers of the original model, but only a portion of the encoder layers.

Depending on the original model architecture, you should be able to access the internal modules directly, e.g.:

emb = model.encoder.embedding
fc = model.fc

These attribute names are just placeholders I picked; you would have to look at the actual model architecture to get the corresponding names.
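For a BERT model, printing the model is an easy way to see the real submodule names. A minimal sketch, assuming the transformers library is installed (I build a randomly initialised BertModel here so nothing is downloaded; from_pretrained("bert-base-uncased") exposes the same attributes):

```python
from transformers import BertConfig, BertModel

# Random-init BertModel, just to inspect the module tree
# (use BertModel.from_pretrained("bert-base-uncased") for real weights).
model = BertModel(BertConfig())
print(model)
# Shows, among others:
#   model.embeddings      - the embedding layers
#   model.encoder.layer   - a ModuleList of 12 BertLayer modules
#   model.pooler          - the pooler head
```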

Once you have all the modules you need, you can pass them to your custom model and reuse them in its forward method. :slight_smile:
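As a sketch of that idea for the question above (again assuming transformers; a randomly initialised BertModel stands in for from_pretrained("bert-base-uncased"), and keeping the first 8 layers is just one possible choice), you can even truncate the encoder in place instead of writing a separate wrapper model:

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

model = BertModel(BertConfig())  # or BertModel.from_pretrained("bert-base-uncased")

# Keep only the first 8 of the 12 BertLayers;
# embeddings and pooler are left untouched.
model.encoder.layer = nn.ModuleList(model.encoder.layer[:8])
model.config.num_hidden_layers = 8

# The truncated model still runs end to end.
input_ids = torch.randint(0, model.config.vocab_size, (1, 10))
out = model(input_ids)
print(out.last_hidden_state.shape)  # torch.Size([1, 10, 768])
```

Which 8 layers to keep (first, last, every other, ...) is a design choice in distillation; slicing the ModuleList differently gives you any subset.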