Splitting Pre-Trained Model by its Parameters

uertenli · December 14, 2019, 12:47pm

Hi,

I want to use a pretrained DenseNet-121 model, which I load directly from Torchvision. However, I want to be able to split the model at one of the transition layers, say transition2, and use only the part of the network up to that point.

There is an existing discussion on this, "How to load part of pre trained model?", but it left me confused.

How can I split the parameters at a certain point? To clarify, I want to collect the values of the keys up to a given key, like this:

# densenet is the pretrained DenseNet-121 loaded from torchvision
for key, value in densenet.state_dict().items():
    print(key)
    if key == "features.transition2.conv.weight":
        break

Kushaj · December 14, 2019, 2:21pm

Say you have defined a model that follows the same structure as densenet121 up to transition2. Now you want to load the original DenseNet weights into your new model. Every layer in your model has a name and corresponding weights. All the weights of a model are stored in a dictionary (its state_dict), where each key is a parameter name built from the layer name, such as features.transition2.conv.weight, and each value is the corresponding tensor.

To get part of the model, you can therefore select only the (key, value) pairs you want, which in your case are those up to transition2.
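A minimal sketch of that key selection, assuming the dictionary keeps its entries in model order (as the loop in the question relies on). The helper name `state_dict_up_to` is hypothetical, not part of PyTorch, and the toy dictionary below only stands in for a real state_dict:

```python
from collections import OrderedDict


def state_dict_up_to(state_dict, last_key):
    """Return the entries of state_dict, in order, up to and including last_key.

    Hypothetical helper for illustration; it works on any ordered mapping.
    """
    if last_key not in state_dict:
        raise KeyError(f"{last_key!r} is not a key of the given state_dict")
    selected = OrderedDict()
    for key, value in state_dict.items():
        selected[key] = value
        if key == last_key:
            break
    return selected


# Toy stand-in for densenet.state_dict(); the values would normally be tensors.
toy_state = OrderedDict([
    ("features.conv0.weight", 0),
    ("features.transition2.conv.weight", 1),
    ("features.denseblock3.denselayer1.norm1.weight", 2),
])
partial = state_dict_up_to(toy_state, "features.transition2.conv.weight")
print(list(partial))
```

With the model from the question, the call would presumably be `state_dict_up_to(densenet.state_dict(), "features.transition2.conv.weight")`, and the result could then be passed to `load_state_dict` of a model whose parameter names match those keys.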

uertenli · December 14, 2019, 8:56pm

Is there a way to get around this without having to write all these layers manually?

Thanks!

Kushaj · December 15, 2019, 3:21pm

You can select the child modules you need. For example, to keep the first 10 children:

from torch import nn
new_model = nn.Sequential(*(list(model.children())[:10]))

For more complicated models, you will have to rewrite the model's source code. But the above method will work in your case.

uertenli · December 15, 2019, 3:31pm

Because of how the pretrained DenseNet model is defined, it has only two children: features (the entire model up to the classifier) and the classifier itself. It seems as though I will have to write the source code myself.

Thanks.
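One possible way around this, sketched under the assumption that the first child (features in torchvision's DenseNet) is itself an nn.Sequential with named sub-modules such as transition2: apply the children-slicing idea one level down. The helper name `truncate_after` is hypothetical, and the toy container below only stands in for densenet.features so the example runs without downloading weights:

```python
from collections import OrderedDict

import torch
from torch import nn


def truncate_after(container, last_child_name):
    """Return an nn.Sequential of container's children up to and including
    last_child_name. The child modules are reused, not copied, so any
    pretrained weights they hold are kept.

    Hypothetical helper for illustration.
    """
    kept = OrderedDict()
    for name, child in container.named_children():
        kept[name] = child
        if name == last_child_name:
            return nn.Sequential(kept)
    raise KeyError(f"no child named {last_child_name!r}")


# Toy stand-in for densenet.features (layer names chosen to mirror DenseNet).
toy = nn.Sequential(OrderedDict([
    ("conv0", nn.Conv2d(3, 8, kernel_size=3, padding=1)),
    ("relu0", nn.ReLU()),
    ("transition2", nn.Conv2d(8, 4, kernel_size=1)),
    ("denseblock3", nn.Conv2d(4, 2, kernel_size=1)),
]))

head = truncate_after(toy, "transition2")
out = head(torch.randn(1, 3, 16, 16))
print(out.shape)  # torch.Size([1, 4, 16, 16])
```

For the model in this thread, the equivalent call would presumably be `truncate_after(densenet.features, "transition2")`. This only works because the sub-modules are applied strictly in sequence; a container with a custom forward pass would still need its source rewritten.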