PyTorch AlexNet not as good as the original AlexNet implementation

The torchvision implementation of AlexNet is based on the "One weird trick for parallelizing convolutional neural networks" paper rather than the original 2012 architecture, and its convolutional layers use fewer filters. The problem is that this network does not perform as well as the original implementation.

Where can I find the original implementation in PyTorch? I have looked into the cnn-benchmark repo, but the converted model does not perform as well as the Torch implementation.
Specifically, I am not able to reach the baseline accuracy of ~61% on the Office-31 Amazon to Webcam domain adaptation task (which is attainable with AlexNet in other frameworks, such as Torch, and is reported in papers). The best I have achieved by porting weights from the above-mentioned repo and converting them to PyTorch using this repo is ~55%.
A difference of ~1-2% in performance would be acceptable, but this ~5% gap is too wide. Any help with this problem, or a pointer to a better AlexNet implementation in PyTorch, would be great!

Note: I have run multiple experiments with different learning rates, batch sizes, numbers of epochs, learning rate decay schemes and schedulers, and with the early layers of the network frozen. The model still saturates at ~55-57%. Also, when porting weights to PyTorch, I used my own implementation of the LocalResponseNormalization layer, which the original implementation requires.
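For reference, here is a minimal sketch of what the original 2012 layout could look like in PyTorch, with grouped convolutions and local response normalization. The class name `OriginalAlexNet`, the 227x227 input size, the ReLU, LRN, pooling order, and the LRN hyperparameters are assumptions for illustration; they would need to match whatever framework the weights were exported from. Recent PyTorch versions ship `nn.LocalResponseNorm`, so a hand-written layer may no longer be necessary.

```python
import torch
import torch.nn as nn


class OriginalAlexNet(nn.Module):
    """Hypothetical sketch of the 2012 AlexNet layout (grouped convs + LRN)."""

    def __init__(self, num_classes: int = 1000):
        super().__init__()

        def lrn() -> nn.Module:
            # Assumed hyperparameters. Note that PyTorch divides alpha by
            # `size` internally, so check the convention of the source
            # framework before porting weights.
            return nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),              # 227 -> 55
            nn.ReLU(inplace=True),
            lrn(),
            nn.MaxPool2d(kernel_size=3, stride=2),                   # 55 -> 27
            nn.Conv2d(96, 256, kernel_size=5, padding=2, groups=2),
            nn.ReLU(inplace=True),
            lrn(),
            nn.MaxPool2d(kernel_size=3, stride=2),                   # 27 -> 13
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1, groups=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1, groups=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                   # 13 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)  # (N, 256 * 6 * 6)
        return self.classifier(x)


if __name__ == "__main__":
    model = OriginalAlexNet().eval()
    with torch.no_grad():
        logits = model(torch.zeros(2, 3, 227, 227))
    print(logits.shape)
```

When loading ported weights into a model like this, it is worth checking that the layer order, the `groups` settings, and the input preprocessing (channel order, mean subtraction, value range) all match the source model, since a mismatch in any of these can cost several points of accuracy.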


Hi, you can check this repository on GitHub:

I have used it for my own task and it has better performance than the torchvision implementation.