I have a pre-trained model called SoundNet, whose weights are available in TensorFlow. I created the model in PyTorch and loaded the weights. The implementation is available here.
The preprocessing and many parameters were taken from the TensorFlow repository.
The output of the first convolutional layer is exactly the same as in TensorFlow; however, the following BatchNorm2d layer yields different numbers.
There is an ambiguity in how the mean and variance of a BatchNorm layer should be used in PyTorch, so I suspect the way I load the BatchNorm weights in PyTorch and the way I use them:
I calculated the BatchNorm2d output manually with NumPy broadcasting, following the formula in the PyTorch reference, and got values close to those of the TensorFlow implementation, which confirms my suspicion about how I use BatchNorm2d in my implementation.
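For reference, a minimal NumPy sketch of the inference-mode BatchNorm2d formula from the PyTorch docs, y = (x - E[x]) / sqrt(Var[x] + eps) * gamma + beta, applied per channel; the function name and argument order here are my own:

```python
import numpy as np

def batchnorm2d_manual(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Inference-mode BatchNorm2d on an (N, C, H, W) array.

    The per-channel statistics are reshaped to (1, C, 1, 1) so they
    broadcast over the batch and spatial dimensions.
    """
    mean = running_mean.reshape(1, -1, 1, 1)
    var = running_var.reshape(1, -1, 1, 1)
    g = gamma.reshape(1, -1, 1, 1)
    b = beta.reshape(1, -1, 1, 1)
    return (x - mean) / np.sqrt(var + eps) * g + b
```

Comparing this function's output against the PyTorch layer's output on the same input is a quick way to check whether the running statistics were loaded correctly.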
Any idea how to use these parameters from a pre-trained model for feature extraction?
After spending a few hours on this as a beginner in PyTorch, I will post the answer; perhaps it will be useful for some people out there:
For the mean and variance we should use running_mean.data and running_var.data when assigning the pre-trained weights. This is strange, since for the other variables, assigning the weights directly without an explicit .data worked.
In general, it is safer to use variable.data to assign weights. The function to load the weights would look like this:
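Since the original snippet is not reproduced above, here is a hedged sketch of what such a loader could look like. The layout of the `params` dictionary (keyed by module name, with TensorFlow-style entries `'weights'`, `'biases'`, `'gamma'`, `'beta'`, `'mean'`, `'var'`) is my own assumption, not SoundNet's actual storage format:

```python
import numpy as np
import torch
import torch.nn as nn

def load_weights(model, params):
    """Copy pre-trained NumPy arrays into a PyTorch model.

    `params` is assumed (hypothetically) to map each module's name, as
    produced by model.named_modules(), to a dict of NumPy arrays.
    """
    with torch.no_grad():
        for name, module in model.named_modules():
            if isinstance(module, nn.Conv2d):
                p = params[name]
                module.weight.data = torch.from_numpy(p['weights'])
                module.bias.data = torch.from_numpy(p['biases'])
            elif isinstance(module, nn.BatchNorm2d):
                p = params[name]
                module.weight.data = torch.from_numpy(p['gamma'])
                module.bias.data = torch.from_numpy(p['beta'])
                # Crucially, assign the running statistics via .data too;
                # these are buffers, not parameters.
                module.running_mean.data = torch.from_numpy(p['mean'])
                module.running_var.data = torch.from_numpy(p['var'])
    return model
```

Remember to call `model.eval()` before extracting features, so that BatchNorm uses the loaded running statistics instead of batch statistics.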