How are you freezing the layers? Freezing is the only way to exclude parameters from training, and in your example your optimizer is constructed over all of the model's parameters. When freezing, this is the way to set up your optimizer:
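Something along these lines, passing only the parameters that still require gradients (SGD and the hyperparameter values here are placeholders; match them to whatever optimizer and settings you are actually using):

```python
import torch

optimizer = torch.optim.SGD(
    # Hand the optimizer only the parameters that are still trainable.
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=0.01,
    momentum=0.9,
    weight_decay=1e-4,
)
```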
The filter makes little practical difference for a plain optimizer with just a learning rate, because a zero gradient then produces a zero update anyway. But since you are using momentum and weight decay, parameters whose requires_grad is False can still be updated by those terms whenever their .grad buffers exist but are merely zeroed (for example after a zero_grad() that zeros rather than clears the buffers), so it is safer to exclude them explicitly.
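Here is a minimal sketch of that effect (the values are arbitrary): a "frozen" parameter with a zeroed, but not cleared, gradient buffer still moves after one SGD step because of the weight decay term.

```python
import torch

p = torch.nn.Parameter(torch.ones(1))
p.requires_grad = False        # frozen
p.grad = torch.zeros_like(p)   # zeroed (not None) grad buffer
opt = torch.optim.SGD([p], lr=0.1, momentum=0.9, weight_decay=0.5)
opt.step()
print(p.data)  # tensor([0.9500]) -- weight decay moved the frozen parameter
```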
You freeze the parameters manually before training. For example, if I want to freeze the first few child modules (indices 0 through 3) of a ResNet encoder, I use:
```python
for i, child in enumerate(model.children()):
    # Walk the sub-modules of the first top-level child (the encoder).
    for k, child_0 in enumerate(child.children()):
        if k <= 3:
            # Disable gradients for everything in this sub-module.
            for params in child_0.parameters():
                params.requires_grad = False
            print("Frozen {} layer {}".format(k, child_0))
    # Only the first top-level child needs freezing, so stop here.
    break
```
This is a small snippet I wrote around my own encoder; you will have to come up with your own freezing loop to target whichever layers you want frozen. Note that all of this happens before training begins.
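As a quick sanity check after the loop runs (optional; `model` here is whatever network you just froze), you can count how many parameters are still trainable:

```python
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print("Trainable parameters: {}/{}".format(trainable, total))
```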
A very good place for it is right after you have defined the model. After this, you define the optimizer exactly as shown at the top, filtering on requires_grad so the frozen parameters never reach it.