Hello, I am still confused about some mechanisms in PyTorch 1.0.
- How do I freeze weights? Some people give examples like this:
for param in model.parameters():
    param.requires_grad = False
If all the parameter weights are set to requires_grad=False, what happens if the input tensor has requires_grad=True, or vice versa?
Is there any difference compared with using torch.no_grad()?
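For concreteness, here is a minimal sketch of the situation I am asking about (the two-layer model is just a placeholder I made up):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# Freeze every parameter, as in the example above.
for param in model.parameters():
    param.requires_grad = False

# An input tensor that itself requires grad, the case I am asking about.
x = torch.randn(2, 4, requires_grad=True)
out = model(x).sum()
out.backward()

print(x.grad is None)                # does the input receive a gradient?
print(model[0].weight.grad is None)  # do the frozen weights receive one?
```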
- I am doing a crazy thing where I have two neural network models, let's call them A and B. A's weights are trainable but B's weights are frozen; accordingly, every parameter of B is set to requires_grad=False. In my network, A processes the input, and partway through, the intermediate A result is fed to B. After B produces an output, that output is combined with the intermediate A result, A processes the combined features, and finally the network produces its result. I have done this, and the loss becomes NaN. I think it is because autograd fails to track the graph once we combine the results of frozen and unfrozen weights. How do I do this properly?
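To make this concrete, here is a stripped-down sketch of the wiring I described (the layer sizes and the concatenation are assumptions for illustration only):

```python
import torch
import torch.nn as nn

class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.front = nn.Linear(4, 8)   # first half of A
        self.head = nn.Linear(16, 1)   # second half of A

    def forward(self, x, b):
        mid = torch.relu(self.front(x))        # intermediate A result
        b_out = b(mid)                         # fed to the frozen B
        combined = torch.cat([mid, b_out], 1)  # combine B's output with intermediate A
        return self.head(combined)             # A processes the combined features

a = A()
b = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
for param in b.parameters():  # B's weights are frozen
    param.requires_grad = False

loss = a(torch.randn(2, 4), b).sum()
loss.backward()
print(a.front.weight.grad is None)  # is the gradient still tracked back to A's front layer?
```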
- As in my third question, what happens if a model's parameters have requires_grad=False (like the B part) and we run it under this statement?
with torch.set_grad_enabled(True):
    model(inputs)
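Expanded slightly, I mean something like this (model here stands in for the frozen B part):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
for param in model.parameters():
    param.requires_grad = False  # frozen, like the B part

inputs = torch.randn(3, 4)
with torch.set_grad_enabled(True):
    out = model(inputs)

print(out.requires_grad)  # does enabling grad mode override the frozen parameters?
```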
- I am still confused about the difference between model.train(False) and model.eval(). Do I need to do both model.train(False) and model.eval() for every validation and test step?
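In other words, is the second call below redundant, or does it do something more?

```python
import torch.nn as nn

model = nn.Dropout(p=0.5)
model.train(False)  # puts the module in evaluation mode...
model.eval()        # ...so is this call needed on top of it?
```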
Simply put, I have a model that uses dropout layers, and I set the mode like this:
in the training phase -> model.train()
in the validation phase -> model.train(False)
in the testing phase -> model.eval()
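Put together, one step of each phase looks roughly like this (the model, data, and optimizer are made-up stand-ins; only the mode-setting calls reflect what I actually do):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.5), nn.Linear(8, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(16, 4), torch.randn(16, 1)

# Training phase: dropout should be active.
model.train()
opt.zero_grad()
loss = F.mse_loss(model(x), y)
loss.backward()
opt.step()

# Validation phase: I call model.train(False) here.
model.train(False)
with torch.no_grad():
    val_loss = F.mse_loss(model(x), y)

# Testing phase: I call model.eval() here (the call I ended up removing).
model.eval()
with torch.no_grad():
    test_out = model(x)
```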
However, I found that my model was not working properly; I had to remove model.eval() to get the best result. Later I tried using model.eval() in the validation phase, but again the result was not good, and I had to remove model.eval() from the testing phase. Could anyone explain this phenomenon? What should I do in the validation and testing phases? Is it enough if we only use model.train(False)? And how about the case where the input tensor has requires_grad=True?