I am trying to freeze some layers in my network. When I freeze them and run on a single GPU, everything works fine. However, when I try to run on multiple GPUs, I get "RuntimeError: you can only change requires_grad flags of leaf variables.".
It sounds to me (I have never seen that error myself) like you are freezing the layers after calling DataParallel. The replicated models point back to the original model's parameters, so their parameters are no longer leaf variables; when you try to freeze them, PyTorch throws that error.
Could you try freezing the weights before calling DataParallel?
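A minimal sketch of that order of operations, using a hypothetical two-layer model (the layer sizes and optimizer are just for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical model: first layer will be frozen, second stays trainable.
model = nn.Sequential(
    nn.Linear(10, 10),
    nn.Linear(10, 2),
)

# Freeze the first layer while its parameters are still leaf tensors,
# i.e. BEFORE wrapping the model in DataParallel.
for param in model[0].parameters():
    param.requires_grad = False

# Only now replicate across GPUs; the replicas inherit the frozen flags.
model = nn.DataParallel(model)

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1
)
```

Note that after wrapping, the original module is reachable as `model.module`, so you can still inspect which parameters are frozen there.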