Model shows different predictions after training without weight update

densenet121 uses batchnorm layers, which update their running estimates (mean and variance) during every forward pass in training mode, even if no optimizer step is taken.
During evaluation these running estimates are applied instead of the per-batch statistics, which explains the difference in your outputs.
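
A minimal sketch illustrating this (assuming torchvision's densenet121 and a random input, just for demonstration):

```python
import torch
import torchvision

torch.manual_seed(0)
model = torchvision.models.densenet121()
x = torch.randn(8, 3, 224, 224)

bn = model.features.norm0  # first BatchNorm2d layer in densenet121

# Eval output before any training-mode forward pass
model.eval()
with torch.no_grad():
    out_before = model(x)
print(bn.running_mean[:3])  # initial running estimates

# A single forward pass in train mode updates the running stats,
# even though no backward pass or optimizer step is performed.
model.train()
with torch.no_grad():
    _ = model(x)
print(bn.running_mean[:3])  # running estimates have changed

# Eval output afterwards differs, since the new running stats are used.
model.eval()
with torch.no_grad():
    out_after = model(x)
print(torch.allclose(out_before, out_after))  # False
```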
