Even after freezing parameters, accuracy is changing

if epoch >= 10:
    model.train(False)
    for param in model.parameters():
        print(param)
        param.requires_grad = False

I have frozen the network after a certain epoch, but the forward pass results after that epoch are still changing (both training and validation accuracy keep changing).

Please help me figure out how to fix this.

Thanks in advance.

Could you post the model architecture, so that we can have a look?

It is a simple binary classifier:

(classifier): SimpleClassifier(
  (main): Sequential(
    (0): Linear(in_features=1024, out_features=512, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.5, inplace=True)
    (3): Linear(in_features=512, out_features=2, bias=True)
  )
)

Fully connected 1024 --> 512
ReLU
Dropout
512 --> 2

Thanks for the response.

Thanks for the code snippet.
The model correctly reproduces the same results in eval mode using a static input:

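import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Dropout(0.5, inplace=True),
    nn.Linear(512, 2)
)

# Switch to eval mode (disables dropout) and freeze all parameters
model.train(False)
for param in model.parameters():
    param.requires_grad = False

# Static input and the reference output to compare against
x = torch.randn(1, 1024)
out_reference = model(x)

# Repeated forward passes should match the reference exactly
for _ in range(100):
    out = model(x)
    print((out - out_reference).abs().max())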

Are you using any other (random) operations?

No, I am not using any other function.
This is exactly the same architecture as my model.

Thanks for the reply.

Does my code snippet reproduce different results using your setup?

It worked.
Thanks for your help.

Hi, I don’t understand how the problem is solved. Based on the code he gave at first, he wanted the training to stop after a certain epoch. But in the code you gave him, you were initializing the model and setting train to false immediately without training it first.
Thanks for your answer.

Based on the initial question, @Kim_KA was wondering why his outputs change even after freezing all parameters and setting the model to eval().
Since his model architecture was not responsible for these changes, something else might have created the randomness, but that’s on @Kim_KA to report. :wink:
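
To narrow down where such changes could come from, a small check along the following lines might help. This is only a sketch: `freeze_report` is a hypothetical helper (not part of PyTorch), and the in-place weight update at the end is just a stand-in for whatever in the real training loop might still touch the parameters. It reports which submodules are still in training mode, which parameters still require gradients, and how far the parameters have moved from a saved snapshot.

```python
import torch
import torch.nn as nn


def freeze_report(module, reference=None):
    """Hypothetical helper: list what could still make the outputs move."""
    report = {
        # Submodules still in training mode (dropout would be active there)
        "modules_in_train_mode": [
            name or "<root>" for name, m in module.named_modules() if m.training
        ],
        # Parameters that would still receive gradients
        "trainable_params": [
            name for name, p in module.named_parameters() if p.requires_grad
        ],
    }
    if reference is not None:
        # Largest absolute difference to the snapshot taken at freeze time
        report["max_param_change"] = max(
            (p.detach() - reference[name]).abs().max().item()
            for name, p in module.named_parameters()
        )
    return report


model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Dropout(0.5, inplace=True),
    nn.Linear(512, 2),
)

# Freeze as in the original question
model.train(False)
for param in model.parameters():
    param.requires_grad = False

# Snapshot of the parameters at freeze time
reference = {name: p.detach().clone() for name, p in model.named_parameters()}

report = freeze_report(model, reference)
print(report)

# Stand-in (assumption) for code that still modifies the weights after the freeze
with torch.no_grad():
    model[0].weight.add_(0.01)

report_after_update = freeze_report(model, reference)
print(report_after_update["max_param_change"])
```

Calling such a helper once per epoch after the freeze would show whether the weights or the mode flags are still changing, or whether the randomness has to come from somewhere else (e.g. the data pipeline).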

From the perspective of his code snippet and model architecture, the outputs are deterministic and static.
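
As an illustration of that point, the sketch below runs the same frozen architecture once in eval mode and once in train mode. `max_deviation` is a hypothetical helper introduced only for this example. Whether a stray `model.train()` call was the actual cause in the original setup is an assumption and was not confirmed in this thread; the sketch only shows that dropout stays random in train mode even when all parameters are frozen.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Dropout(0.5, inplace=True),
    nn.Linear(512, 2),
)

# Freeze all parameters; this alone does not disable dropout
for param in model.parameters():
    param.requires_grad = False

x = torch.randn(1, 1024)


def max_deviation(module, inp, runs=10):
    """Hypothetical helper: largest deviation of repeated forward passes from the first one."""
    reference = module(inp)
    return max((module(inp) - reference).abs().max().item() for _ in range(runs))


# Eval mode: dropout is a no-op, so repeated passes are identical
model.eval()
eval_deviation = max_deviation(model, x)

# Train mode (e.g. a loop calling model.train() at the start of every epoch):
# dropout samples a new mask on each forward pass
model.train()
train_deviation = max_deviation(model, x)

print(f"eval mode:  {eval_deviation}")
print(f"train mode: {train_deviation}")
```

If a training loop calls `model.train()` at the beginning of each epoch, the `model.train(False)` call has to come after it, otherwise dropout is switched back on.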

Thank you @ptrblck :slightly_smiling_face: