I’m trying to freeze the first part of my network, but I’m getting the following error when I try to freeze it:
```
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
```
I have a big model class A, which consists of models B, C, D.
The flow goes B -> C -> D.
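For reference, here is a minimal sketch of my setup (the module names and layer choices are simplified placeholders, not my actual architecture):

```python
import torch.nn as nn

class A(nn.Module):
    def __init__(self):
        super().__init__()
        # B, C, D stand in for my real conv/batchnorm/activation stacks
        self.B = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
        self.C = nn.Sequential(nn.Conv2d(16, 32, 3), nn.BatchNorm2d(32), nn.ReLU())
        self.D = nn.Conv2d(32, 10, 3)

    def forward(self, x):
        # flow: B -> C -> D
        return self.D(self.C(self.B(x)))
```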
So after training for a certain number of iterations, I’d like to freeze B and train only C and D. I’ve tried doing this (inside A.forward()):
```python
for param in self.B.parameters():
    param.requires_grad = False
```
However, I’m getting the error message above.
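For what it’s worth, I can reproduce the exact same error with this standalone snippet, so my guess (purely an assumption) is that the parameters I’m touching are somehow non-leaf tensors at that point:

```python
import torch

w = torch.randn(3, requires_grad=True)  # created directly: a leaf tensor
v = w * 2                               # computed from w: not a leaf
print(w.is_leaf, v.is_leaf)             # True False
v.requires_grad = False                 # RuntimeError: you can only change
                                        # requires_grad flags of leaf variables...
```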
What does the error message mean, exactly? My B network contains many convolution, batchnorm, and activation layers arranged sequentially. Obviously certain layers come after other layers…
- What does it mean that `requires_grad` can only be changed on leaf variables? (I mean… leaf variables don’t really matter, right? Why would they require gradient flow?)
- Also, what is a “computed variable in a subgraph”?
- What exactly does `var_no_grad = var.detach()` do? (See my attempt to understand it below.)
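Here is a small snippet showing my current (possibly wrong) understanding of what `detach()` does:

```python
import torch

w = torch.randn(3, requires_grad=True)
v = w * 2                       # non-leaf, part of the autograd graph
v_no_grad = v.detach()          # new view of the same data, cut off from the graph
print(v_no_grad.is_leaf)        # True
print(v_no_grad.requires_grad)  # False
```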