Will PyTorch give different results from Tensorflow?

I would like to reimplement a movie recommender system created in TensorFlow with PyTorch.

My question is: does this modification affect the recommendation results?

Hello Rostom!

Yes and no …

If you are careful, you should be able to reimplement a tensorflow
network in pytorch and get the same* results.

*) For some definition of “same” …

What could cause the results to differ?

Your two networks aren’t actually the same. (You weren’t careful
enough.)

The pytorch module / class / function that is supposed to match
one in tensorflow doesn’t actually do the same thing. This could
be an error in pytorch or tensorflow, or could be a legitimate
difference of opinion. If this happens you might have to find a
work-around, or reimplement tensorflow’s specific logic with your
own pytorch code to get it to match.
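As a hypothetical illustration of such a legitimate difference of opinion (not a claim about your specific network): the two frameworks use different default weight initializations for a fully-connected layer. To my understanding, pytorch's nn.Linear default works out to a uniform range of roughly 1/sqrt(fan_in), while keras's Dense defaults to glorot (Xavier) uniform. A plain-python sketch of the two ranges, so you can see they genuinely differ:

```python
import math

def torch_linear_bound(fan_in):
    # pytorch nn.Linear default (kaiming uniform with a = sqrt(5))
    # reduces to Uniform(-1/sqrt(fan_in), 1/sqrt(fan_in))
    return 1.0 / math.sqrt(fan_in)

def keras_dense_bound(fan_in, fan_out):
    # keras Dense default: glorot (Xavier) uniform,
    # Uniform(-sqrt(6/(fan_in+fan_out)), sqrt(6/(fan_in+fan_out)))
    return math.sqrt(6.0 / (fan_in + fan_out))

fan_in, fan_out = 784, 128
print(torch_linear_bound(fan_in))           # ~0.0357
print(keras_dense_bound(fan_in, fan_out))   # ~0.0811
```

So even "the same" Linear / Dense layer starts training from differently-scaled weights unless you override one side's initializer to match the other.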

Pseudo-random numbers – used for example to initialize weights
and for things like RandomSampler – don’t match between
tensorflow and pytorch (and, no, there is no practical way to make
their built-in generators match). In this case,
the results of your two networks will differ, but they should agree
statistically. That is, if you rerun your training from scratch several
times (using different random seeds each time), the distribution
of results you get from pytorch should match that from tensorflow.
You could avoid this by not using tensorflow’s and pytorch’s
built-in random functions. For example, you can initialize random
weights and randomly sample data “by hand,” using your own same
pseudo-random number generator for both.
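A minimal sketch of the "by hand" approach, assuming you use numpy as the single shared generator (the commented loading lines are hypothetical – the exact names depend on your models):

```python
import numpy as np

def make_weights(seed, fan_in, fan_out):
    # one seeded generator, used identically for both frameworks
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))   # e.g. a glorot-uniform range
    w = rng.uniform(-limit, limit, size=(fan_out, fan_in))
    b = np.zeros(fan_out)
    return w, b

w1, b1 = make_weights(0, 784, 128)
w2, b2 = make_weights(0, 784, 128)
assert (w1 == w2).all()   # same seed -> bit-identical weights

# hypothetical loading into each framework:
# torch_linear.weight.data = torch.from_numpy(w1).float()
# keras_dense.set_weights([w1.T.astype("float32"), b1.astype("float32")])
```

The same idea applies to data sampling: draw your shuffling indices from one seeded generator and feed them to both training loops.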

The floating-point round-off differs in the two cases, because the
order of (otherwise mathematically equivalent) operations differs.
In this case your results should start out almost the same, only
differing within floating-point round-off, and then drift apart as your
training progresses. Your results should still be statistically the
same, but, in any individual run, could differ by a lot. There is
really no way to avoid this effect.
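You can see the seed of this drift with nothing more than plain python floats – the same mathematical sum, computed in two orders, gives bit-different results:

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

print(left == right)       # False
print(abs(left - right))   # ~1.1e-16
```

A network does millions of such additions per step (and the two frameworks order them differently, e.g. in reductions and matrix multiplies), so these tiny discrepancies compound as training progresses.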

As a concrete example, I implemented the same simple digit
classification network in both tensorflow and pytorch. I had to
dig around a little to figure out the exact logic that the two
systems were using for their random weight initialization, and
then tweak things a little to get them to match (statistically).
After doing this, I was able to verify that I was getting statistically
equivalent results. (I did not try to use my own pseudo-random
number generator to get essentially identical results.)

Best.

K. Frank


Thanks for your reply. One more question: by “statistically” the same, did you mean the same RMSE or MAE? This is exactly what I want to know.