requires_grad is not working

Hi everyone. I have created a modular model via this class:

import torch.nn as nn

class VGG8StepByStepModelModular(nn.Module):
    def __init__(self):
        super().__init__()
        vgg_8_model = get_vgg8()  # helper (defined elsewhere) that builds the full VGG-8 model

        self.add_module("stage_1", vgg_8_model.features[:8])
        self.add_module("stage_2", vgg_8_model.features[8:])
        self.add_module("classifier", vgg_8_model.classifier)

    def forward(self, x):
        x = self.stage_1(x)
        x = self.stage_2(x)
        x = x.view(-1, 512 * 7 * 7)  # flatten before the classifier
        x = self.classifier(x)
        return x

combined_vgg8_student = VGG8StepByStepModelModular()

Since I have already trained a smaller model that contains only the stage_1 part of this model, I want to load the trained weights into stage_1 and then train the rest of the model in the next step. I have loaded the trained stage_1 module successfully, but when I try to set requires_grad = False for the stage_1 module, it does not work. However, I found that combined_vgg8_student.stage_1.eval() does set training to False. What is the problem? Any help would be appreciated.

for params in combined_vgg8_student.stage_1.parameters():
    params.requires_grad = False

print("STAGE 1 ", combined_vgg8_student.stage_1.training)        # Prints True
print("STAGE 2 ", combined_vgg8_student.stage_2.training)        # Prints True
print("Classifier ", combined_vgg8_student.classifier.training)  # Prints True

combined_vgg8_student.stage_1.eval()
print("STAGE 1 ", combined_vgg8_student.stage_1.training)        # Prints False
print("STAGE 2 ", combined_vgg8_student.stage_2.training)        # Prints True
print("Classifier ", combined_vgg8_student.classifier.training)  # Prints True


Hi,

The two are not related.
mod.train() and mod.eval() change the mod.training flag on the module (and its children). This flag tells modules like batchnorm or dropout how they should behave.
The requires_grad field on the Parameters tells autograd whether it should track gradients for these Tensors or not.
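
For example, here is a minimal sketch (using a throwaway toy module, not your model) showing that the two flags are independent:

import torch.nn as nn

toy = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4), nn.Dropout(0.5))

toy.eval()  # flips the training flag on the module and its children
print(toy.training)                                    # False
print(all(p.requires_grad for p in toy.parameters()))  # True: autograd is untouched

toy.train()
toy.requires_grad_(False)  # tells autograd to stop tracking these parameters
print(toy.training)                                    # True: the mode is untouched
print(any(p.requires_grad for p in toy.parameters()))  # False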

Hi,

Thank you for your reply, it helps me a lot. So, as far as I understand, calling eval() for a part of a model is meaningless. Am I right?
For instance, in my code snippet, combined_vgg8_student.stage_1.eval() does not prevent autograd from computing gradients for stage_1 in the training loop.

"calling eval() for a part of a model is meaningless. Am I right?"

It depends what you want to do.
If you want to prevent gradient computation, then it is indeed not what you want. You need to set the requires_grad field on each parameter, or call combined_vgg8_student.stage_1.requires_grad_(False), which sets the field on all of stage_1's parameters at once.
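
For example (a sketch; the optimizer choice and learning rate are placeholders), you can freeze stage_1 and then build the optimizer from only the parameters that still require gradients:

import torch

combined_vgg8_student.stage_1.requires_grad_(False)

# Only hand the still-trainable parameters to the optimizer
trainable = [p for p in combined_vgg8_student.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)  # placeholder optimizer and lr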

If you wanted to disable dropout and put batchnorm in evaluation mode (so that it uses the saved running statistics), then this was the right thing to do :)
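
And if you want stage_1 fully frozen (no gradients and fixed batchnorm statistics), you can combine the two. One caveat, sketched below with a placeholder num_epochs: calling .train() on the parent model sets training = True on every child again, so stage_1.eval() has to be re-applied after it:

combined_vgg8_student.stage_1.requires_grad_(False)  # no gradients for stage_1

for epoch in range(num_epochs):  # num_epochs is a placeholder
    combined_vgg8_student.train()         # sets training=True on all submodules
    combined_vgg8_student.stage_1.eval()  # put stage_1 back in eval mode
    # ... run the training steps for this epoch ...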
