First off, this is my first post here, so if I've forgotten something, please just ask. I hope it's okay to post TF/PyTorch comparisons here.
I tested some simple transfer learning in TensorFlow and PyTorch with EfficientNet_B0, EfficientNet_B5, and ConvNeXt (for comparison) on the CIFAR100 dataset, using pre-trained ImageNet weights and frozen backbones. The code and results are here:
PyTorch (Lightning): Google Colab
TensorFlow: Google Colab
As you can see, the TensorFlow version of EfficientNet gets 70+% accuracy out of the box, while the PyTorch version only gets ~58%. I also tested it with more epochs on another machine, and it didn't get much better. In contrast, the two ConvNeXt versions perform similarly to each other. I know one can't expect perfectly mirrored results, but a 12-13% gap is a lot. Why is that? Did I make a mistake in the data processing or in building the network? And why does the ConvNeXt version run fine, then?
To comment on the workflow: I tried to make both pipelines as similar as possible, so both approaches share the resize and crop transformation that torchvision returns as the IMAGENET1K transforms. Initially I also had image augmentation, but I dropped it for now for a cleaner comparison. I also know that the B5 ImageNet weights have a higher recommended resolution, but training at that size takes forever on Colab, and it didn't perform better when I tested it. And the TF version of B5 runs just fine at 224x224.
I also paid attention to the right input scale, [0-255] for TF and [0-1] for PyTorch. The use of PyTorch Lightning should not make a difference here, right?
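To make the scale point concrete, this is the per-pixel arithmetic the PyTorch pipeline ends up doing (plain-Python sketch with the standard ImageNet statistics; the Keras EfficientNet instead takes the raw [0-255] values and rescales inside the model):

```python
# Standard ImageNet channel statistics used by torchvision's Normalize
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_255):
    """Map one [0-255] RGB pixel to the standardized values the
    PyTorch model sees: scale to [0-1], then (x - mean) / std."""
    return [((v / 255.0) - m) / s for v, m, s in zip(rgb_255, MEAN, STD)]
```

So a white pixel `[255, 255, 255]` becomes roughly `[2.25, 2.43, 2.64]` on the PyTorch side, while the TF model receives the raw 255s.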
As this is my first step into transfer learning, I'd be glad to hear about any mistakes I made, so I can avoid them in the future. Thanks in advance!