Hi. I have a CNN with two CE losses (two separate datasets are trained simultaneously), each with its own FC layer before the final cross-entropy loss computation.
What should I do if I want to freeze one of these losses, i.e. train only one of them without removing either?
I have set requires_grad of one branch to False.
Should I also set the corresponding layers to evaluation mode? And what about the gradient computed for the branch to be frozen — should I simply set it to zero?
It depends on your current workflow.
I.e. are you using both linear layers, even though you are currently only using a single dataset?
If so, setting requires_grad=False for the linear layer that should not be used should be enough.
Calling eval() on it shouldn’t be necessary if you are only using a linear layer, but it won’t hurt.
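To make this concrete, here is a minimal sketch of a shared backbone with one FC head per dataset, where one head is frozen via requires_grad=False. The names (backbone, fc1, fc2) and shapes are illustrative, not taken from the original post:

```python
import torch
import torch.nn as nn

# Hypothetical two-head setup: shared backbone, one FC head per dataset.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(32, 16), nn.ReLU())
fc1 = nn.Linear(16, 10)  # head for Dataset 1 (to be frozen)
fc2 = nn.Linear(16, 5)   # head for Dataset 2 (still training)

# Freeze head 1: its parameters will not receive gradients.
for param in fc1.parameters():
    param.requires_grad = False

criterion = nn.CrossEntropyLoss()
x = torch.randn(4, 32)
target2 = torch.randint(0, 5, (4,))

features = backbone(x)
loss = criterion(fc2(features), target2)
loss.backward()

# The frozen head gets no gradients; the backbone and active head do.
assert fc1.weight.grad is None
assert fc2.weight.grad is not None
```

Since fc1 is never called in the forward pass here, it would not receive gradients anyway; setting requires_grad=False additionally guarantees it stays untouched even if its output is computed somewhere.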
To train new_logits only (while freezing FC-Dataset-1), I simply set .requires_grad for FC-Dataset-1 to False as follows.
for name, param in resnet.named_parameters():
    if name in ['logits.weight', 'logits.bias']:
        param.requires_grad = False
    else:
        param.requires_grad = True
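A sketch of how this loop behaves on a small stand-in model, plus one detail worth adding: passing only the trainable parameters to the optimizer ensures that optimizers with running state (momentum, Adam) cannot keep updating the frozen layer. The layer names ('logits', 'new_logits') follow the post, but the architecture is assumed:

```python
import torch
import torch.nn as nn

# Stand-in for the model in the snippet above (architecture is made up).
class TwoHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 8)
        self.logits = nn.Linear(8, 4)      # head for Dataset 1 (frozen)
        self.new_logits = nn.Linear(8, 3)  # head for Dataset 2 (trained)

model = TwoHead()

# Same freezing logic as in the post's loop.
for name, param in model.named_parameters():
    param.requires_grad = name not in ['logits.weight', 'logits.bias']

# Hand the optimizer only the parameters that should be trained.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

assert not model.logits.weight.requires_grad
assert model.new_logits.weight.requires_grad
```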
And during training, I ignored the output for FC-Dataset-1 as follows.
I assume you’ve written a custom resnet model, as you are expecting two outputs?
Note that your current approach will not switch between the linear layers, but call new_logits on top of resnet.logits. Also, y_pred1 isn’t defined in your code.
If you haven’t defined a custom resnet but are using the torchvision implementation, note that this model does not contain a resnet.logits layer and you are simply assigning a new linear layer to this attribute.
Hmmm, actually I am pre-training the net using Dataset2, and afterwards the entire net will be activated to train both losses with the two datasets simultaneously.
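That two-phase workflow can be sketched as follows — freeze the Dataset-1 head during pre-training, then flip requires_grad back to True for joint training (the module names here are assumptions, not from the post):

```python
import torch.nn as nn

# Illustrative modules standing in for the shared net and the two FC heads.
shared = nn.Linear(16, 16)
fc1 = nn.Linear(16, 10)  # Dataset 1 head, frozen during pre-training
fc2 = nn.Linear(16, 5)   # Dataset 2 head, used for pre-training

# Phase 1: pre-train on Dataset 2 with the Dataset-1 head frozen.
for p in fc1.parameters():
    p.requires_grad = False
# ... pre-training loop on Dataset 2 would go here ...

# Phase 2: unfreeze everything and train both losses jointly.
for p in fc1.parameters():
    p.requires_grad = True
# ... joint training loop on both datasets would go here ...

assert all(p.requires_grad for p in fc1.parameters())
```

If a fresh optimizer is created for phase 2, it should be given all parameters (or both heads' parameter groups) so the newly unfrozen layer is actually updated.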