Huge difference in precision between PyTorch and TensorFlow

unemployed-denizen · June 9, 2022, 5:12pm

I’m feeding the same data into a single convolution layer in PyTorch and in TensorFlow. I have made sure that:

the inputs are the same (initialized randomly, then shared)
the kernel weights and biases are the same (initialized randomly, then shared)
the padding is forced to be the same

However, the results show a large difference in precision between the two frameworks: the outputs differ by more than 1e-3.

I have attached more details in this notebook (click here). The problem gets worse as I increase the number of input and output channels, while small channel counts seem to be fine.

I would really appreciate it if someone could tell me what is going wrong. Thanks in advance.

Framework Version:

PyTorch 1.10.0a0+git593e8f4
TensorFlow 2.5.0 (but it’s tensorflow.compat.v1 in the notebook)

ptrblck · June 9, 2022, 6:50pm

This value doesn’t say much without seeing the ranges or knowing if it’s a relative or absolute error.
Based on your code snippet, it seems you are calculating the absolute error in an output range of ~1000, which corresponds to a relative error of ~1e-6 (1e-3 / 1000) and is expected for float32.
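
As a rough illustration of that point, here is a minimal sketch in plain Python. It is not the code from the notebook: the helper names `to_f32` and `dot_f32`, the vector length of 4096, and the uniform random inputs are assumptions made for the example. It simulates float32 accumulation of the same dot product in two different orders, which is one way two convolution implementations can legitimately disagree, and then reports both the absolute and the relative difference.

```python
import random
import struct

def to_f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot_f32(a, b, reverse=False):
    """Dot product with every intermediate result rounded to float32.

    `reverse` only changes the order in which the terms are accumulated.
    """
    order = range(len(a) - 1, -1, -1) if reverse else range(len(a))
    acc = 0.0
    for i in order:
        acc = to_f32(acc + to_f32(a[i] * b[i]))
    return acc

random.seed(0)
n = 4096  # assumed reduction size; the sum of products lands near 1000
a = [to_f32(random.random()) for _ in range(n)]
b = [to_f32(random.random()) for _ in range(n)]

forward = dot_f32(a, b)                 # accumulate first to last
backward = dot_f32(a, b, reverse=True)  # same terms, opposite order

abs_err = abs(forward - backward)
rel_err = abs_err / abs(forward)
f32_eps = 2.0 ** -23  # float32 machine epsilon, about 1.19e-7

print(f"output magnitude: {forward:.3f}")
print(f"absolute error:   {abs_err:.3e}")
print(f"relative error:   {rel_err:.3e}")
print(f"float32 epsilon:  {f32_eps:.3e}")
```

The absolute difference grows with the magnitude of the output and with the number of accumulated terms, which would be consistent with the error getting larger as channels are added. The relative difference is the more meaningful number to compare against float32 precision.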

unemployed-denizen · June 10, 2022, 8:57am

I agree that these errors do not matter: I managed to transfer the model weights from TensorFlow to PyTorch and tested the results successfully today. I was frustrated at first, so thank you for pointing out the relative error.