RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach()

I’m trying to freeze the first part of my network, but I’m getting the following error when I try to do so:

RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn’t require differentiation use var_no_grad = var.detach().

I have a big model class A, which consists of models B, C, and D.

The flow goes B -> C -> D.

So after training for a certain number of iterations, in A.forward(), I’d like to freeze B and train only C and D. I’ve tried doing the following (inside A.forward()):

for param in self.B.parameters():
  param.requires_grad = False

However, I’m getting the error message above.
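For reference, the freeze happens inside A.forward(), roughly like this (heavily simplified; the freeze_at threshold and the iteration argument are placeholders I made up for this sketch, not my real code):

import torch.nn as nn

class A(nn.Module):
  def __init__(self, B, C, D, freeze_at=1000):
    super().__init__()
    self.B, self.C, self.D = B, C, D
    self.freeze_at = freeze_at  # placeholder: iteration after which B should be frozen

  def forward(self, x, iteration):
    if iteration > self.freeze_at:
      for param in self.B.parameters():
        param.requires_grad = False  # <- this is where the RuntimeError is raised
    x = self.B(x)
    x = self.C(x)
    return self.D(x)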

What exactly does the error message mean? My B network contains many convolution, batchnorm, and activation layers arranged sequentially. Obviously certain layers come after other layers…

  1. What does it mean that only the requires_grad flag of leaf variables can be changed? (I mean… leaf variables don’t really matter, right? Why would they require gradient flow? See the quick check I sketched after this list.)
  2. Also, what is meant by “a computed variable in a subgraph”?
  3. What exactly does var_no_grad = var.detach() do?
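My naive assumption is that B’s weights are ordinary nn.Parameters created in its __init__, so I would have expected a quick check like this (illustrative only) to print True for every parameter:

for name, param in self.B.named_parameters():
  print(name, param.is_leaf)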

Note that I tried building some simple networks to reproduce the problem and couldn’t, and I can’t share the whole codebase because it is gigantic and my company’s IP.
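For example, the kind of minimal test I tried looks roughly like this (just a sketch with made-up layer sizes, not my real model), and this version runs without any error, so it doesn’t reproduce the problem:

import torch
import torch.nn as nn

# toy stand-ins for B, C, D
B = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
C = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
D = nn.Sequential(nn.Flatten(), nn.Linear(8 * 4 * 4, 2))

x = torch.randn(1, 3, 4, 4)

# freezing B here works fine, no RuntimeError
for param in B.parameters():
  param.requires_grad = False

out = D(C(B(x)))
out.sum().backward()  # gradients flow into C and D only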

I’m trying to get some ideas as to what that error points to exactly…