Hi, I encountered a strange problem: when I set model.eval() in the evaluation stage and extract bottleneck features from audio, the extracted embeddings are all [NaN], but when I set model.train(), the embeddings are normal numbers. If I set model.train() and set requires_grad = False on all model parameters, will the inference results be accurate? Thanks very much for your help.
Could you please post an executable snippet that’d reproduce this error?
Certain layers, like batch norm, work differently in train mode, so it's best to use eval mode for inference.
It depends on what layers you have in your model. Typically:
- If you have any dropout layers, setting model.eval() will disable the random masking.
- If you have any batch norm layers, your model will use the running stats instead of statistics computed from the current input (see the sketch below).
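To make both effects concrete, here is a tiny standalone sketch (an illustrative toy model, not yours) showing how the two modes change the output for the same input:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.BatchNorm1d(4), nn.Dropout(p=0.5))
x = torch.randn(8, 4)

model.train()
out_train = model(x)  # batch norm normalizes with batch stats; dropout zeroes ~half the activations

model.eval()
out_eval = model(x)   # batch norm normalizes with running stats; dropout is a no-op

print(torch.equal(out_train, out_eval))  # False: the two modes produce different outputs
```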
You most likely want to debug why eval mode is failing for your model, so I’d second srishti’s request for a self-contained code snippet to reproduce the issue.
Thanks for your reply. I found the problem occurs in the BatchNorm1d line:

```python
self.conv = Conv1d(
    in_channels=in_channels,
    out_channels=out_channels,
    kernel_size=kernel_size,
    dilation=dilation,
)
self.activation = activation()
self.norm = BatchNorm1d(input_size=out_channels)

def forward(self, x):
    pdb.set_trace()
    for name, param in self.norm.named_parameters():
        print("name----->", name)
        print("params---->", param)
    pdb.set_trace()
    return self.norm(self.activation(self.conv(x)))
```
The outputs of self.conv(x) and self.activation() are fine, and the 'weight' and 'bias' parameters of self.norm look fine, but any tensor that goes through self.norm(x) becomes NaN. When I set track_running_stats=False, self.norm(x) works well.
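Since named_parameters() only shows the learnable weight and bias, the running_mean/running_var that eval mode actually uses are worth checking too; they are stored as buffers, not parameters. A minimal check (here `model` stands in for your network, and this assumes a plain torch.nn.BatchNorm1d underneath the wrapper):

```python
import torch

# The affine parameters (weight/bias) appear in named_parameters(), but the
# running statistics that eval mode relies on are registered as buffers.
for name, buf in model.named_buffers():
    if buf.dtype.is_floating_point and torch.isnan(buf).any():
        print(f"NaN found in buffer: {name}")
```

If running_mean or running_var turns up here, the stats were corrupted at some point during training (e.g. by a NaN batch), which would explain why eval mode emits NaNs while train mode, which normalizes with the current batch statistics, does not.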
I have another question: if I set track_running_stats=False for torch.nn.BatchNorm1d() in both the training phase (model.train()) and the test phase (model.eval()), will it reduce the system performance?
During eval, the accuracy of your model can suffer if, for example, you pass inputs with a batch size of 1, because the activations are then normalized with statistics computed from that single input rather than with stable running stats. It may not matter if you are using large batch sizes during evaluation.
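As a small illustration of why batch size matters here (a toy layer, not your model): with track_running_stats=False, the output for a given sample depends on the rest of the batch, because batch statistics are used even in eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4, track_running_stats=False)
bn.eval()  # no running stats exist, so batch statistics are used even in eval mode

x = torch.randn(8, 4, 10)
out_in_batch = bn(x)[0]   # first sample, normalized together with 7 others
out_alone = bn(x[:1])[0]  # the same sample, normalized on its own
print(torch.allclose(out_in_batch, out_alone))  # False: output depends on the batch
```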
Thank you. I got it.