RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach()

I’m trying to freeze the first part of my network, but I’m getting the following error when I try to do so:

RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn’t require differentiation use var_no_grad = var.detach().

I have a big model class A, which consists of models B, C, and D.

The flow goes B -> C -> D.

So after training for a certain number of iterations, in A.forward(), I’d like to freeze B and train only C and D. I’ve tried doing the following (inside A.forward()):

for param in self.B.parameters():
  param.requires_grad = False

However, I’m getting the error message above.
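For reference, the freeze happens inside A.forward(), roughly like this (heavily simplified; the freeze_at threshold and the iteration argument are placeholders I made up for this sketch, not my real code):

import torch.nn as nn

class A(nn.Module):
  def __init__(self, B, C, D, freeze_at=1000):
    super().__init__()
    self.B, self.C, self.D = B, C, D
    self.freeze_at = freeze_at  # placeholder: iteration after which B should be frozen

  def forward(self, x, iteration):
    if iteration > self.freeze_at:
      for param in self.B.parameters():
        param.requires_grad = False  # <- this is where the RuntimeError is raised
    x = self.B(x)
    x = self.C(x)
    return self.D(x)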

What exactly does the error message mean? My B network contains many convolution, batchnorm, and activation layers arranged sequentially. Obviously certain layers come after other layers…

  1. What does it mean that only the requires_grad flag of leaf variables can be changed? (I mean… leaf variables don’t really matter, right? Why would they require gradient flow? See the quick check I sketched after this list.)
  2. Also, what is meant by “a computed variable in a subgraph”?
  3. What exactly does var_no_grad = var.detach() do?
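My naive assumption is that B’s weights are ordinary nn.Parameters created in its __init__, so I would have expected a quick check like this (illustrative only) to print True for every parameter:

for name, param in self.B.named_parameters():
  print(name, param.is_leaf)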

Note that I tried building some simple networks to reproduce the problem and couldn’t, and I can’t share the whole codebase because it is gigantic and my company’s IP.
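For example, the kind of minimal test I tried looks roughly like this (just a sketch with made-up layer sizes, not my real model), and this version runs without any error, so it doesn’t reproduce the problem:

import torch
import torch.nn as nn

# toy stand-ins for B, C, D
B = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
C = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())
D = nn.Sequential(nn.Flatten(), nn.Linear(8 * 4 * 4, 2))

x = torch.randn(1, 3, 4, 4)

# freezing B here works fine, no RuntimeError
for param in B.parameters():
  param.requires_grad = False

out = D(C(B(x)))
out.sum().backward()  # gradients flow into C and D only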

I’m trying to get some ideas as to what that error points to exactly…