I am trying to find an easy way to freeze layers in my network. I have an encoder-decoder network with two decoder heads (subnetworks), and I want to freeze the layers in one of the heads. Given how the chain rule works in backprop, can I simply omit the loss of the head I want to freeze, or force it to 0? If so, will the backprop contribution of that subnetwork be 0? My understanding of backprop is that the gradients from each subnetwork are summed at the bottleneck layer.
Or do I need to set param.requires_grad = False on every parameter of the subnet I want to freeze?
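To make the two options concrete, here is a minimal sketch of what I mean (the model and shapes are made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical shared encoder with two decoder heads.
encoder = nn.Linear(4, 8)
head_a = nn.Linear(8, 2)
head_b = nn.Linear(8, 2)  # the head I want to freeze

x = torch.randn(3, 4)
target = torch.randn(3, 2)

# Option 1: omit head B's loss entirely. Head B never appears in the
# autograd graph, so its parameters get no gradient and it contributes
# nothing to the encoder's gradient at the bottleneck.
z = encoder(x)
loss = nn.functional.mse_loss(head_a(z), target)  # head B's loss omitted
loss.backward()

print(head_b.weight.grad)   # None: head B took no part in backprop
print(encoder.weight.grad)  # non-None: gradient comes from head A only

# Option 2: freeze head B's parameters explicitly. Its weights are then
# skipped by the optimizer even if its loss is still computed.
for p in head_b.parameters():
    p.requires_grad = False
```

One subtlety I noticed: multiplying head B's loss by 0 (instead of omitting it) still runs backward through head B and fills its .grad tensors with zeros, which is not quite the same as them receiving no gradient at all.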