Hi. I ran a study on private data with PyTorch, but I didn’t get good results and the training took too long. When I did the same work in TensorFlow, I got a very short training time and a much higher accuracy. What could be the reason? Note: the same model and the same optimization algorithm in both.
Typically, people with this problem don’t actually have “the same model and the same optimization algorithm” to the extent they think. If they did, they would get comparable results.
So how do I know whether it’s exactly the same? I wrote the code for both of them myself, and everything looks the same.
What I do is check that the same input produces the same output, then the same gradients, then the same optimizer step (with “same” meaning up to numerical accuracy). Many operations have subtle differences between frameworks, e.g. whether a loss takes a “mean” or a “sum”.
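To illustrate the “mean vs. sum” pitfall concretely, here is a small NumPy sketch (an illustration of the math, not either framework’s code): the gradient of an MSE loss with sum reduction is N times the gradient with mean reduction, so two otherwise identical models effectively train at learning rates that differ by the batch size.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
pred = rng.normal(size=N)     # model predictions for one batch
target = rng.normal(size=N)   # ground-truth targets

# Gradient of MSE w.r.t. pred under the two reduction conventions:
#   mean reduction: L = (1/N) * sum((pred - target)^2)  ->  dL/dpred = 2*(pred - target)/N
#   sum reduction:  L = sum((pred - target)^2)          ->  dL/dpred = 2*(pred - target)
grad_mean = 2 * (pred - target) / N
grad_sum = 2 * (pred - target)

# The "same" loss differs by a factor of N in its gradients,
# which rescales every optimizer step by the batch size.
print(np.allclose(grad_sum, N * grad_mean))  # True
```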
Can you take a look if I send you the codes?
There are differences in how some modules are implemented in TensorFlow vs PyTorch.
e.g. batch norm momentum: 0.1 in PyTorch behaves the same as 0.9 in TensorFlow.
Paddings are slightly different too.
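To make the momentum point concrete, here is a NumPy sketch of the two update conventions (a simplification, not either library’s internals): PyTorch blends in the new batch statistic with weight `momentum`, while TensorFlow/Keras keeps the old running statistic with weight `momentum`, so `0.1` in one corresponds to `0.9` in the other.

```python
import numpy as np

def running_mean_pytorch(batch_means, momentum=0.1):
    """PyTorch convention: new = (1 - momentum) * old + momentum * batch_stat."""
    running = 0.0
    for m in batch_means:
        running = (1.0 - momentum) * running + momentum * m
    return running

def running_mean_tensorflow(batch_means, momentum=0.9):
    """TF/Keras convention: new = momentum * old + (1 - momentum) * batch_stat."""
    running = 0.0
    for m in batch_means:
        running = momentum * running + (1.0 - momentum) * m
    return running

batch_means = np.random.default_rng(1).normal(size=20)
# momentum=0.1 (PyTorch convention) matches momentum=0.9 (TF convention):
print(np.isclose(running_mean_pytorch(batch_means, 0.1),
                 running_mean_tensorflow(batch_means, 0.9)))  # True
```

So if you copied `momentum=0.1` from your PyTorch config into TensorFlow verbatim, the running statistics would be updated very differently.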
I’m fully booked this year, sorry.
thank you again for everything
I agree with Tom that you cannot just assume that because your code looks the same, the results will be the same.
If you want to make sure everything is the same, layer by layer, you need tests for that, from individual layers up to sequences of layers, between TensorFlow and PyTorch.
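A minimal sketch of such a layer-by-layer test (the helper name is hypothetical; in practice the two callables would wrap the PyTorch and TensorFlow layers, here they are NumPy stand-ins): copy identical weights into both implementations, feed both the same input, and assert the outputs agree within tolerance before moving on to the next layer.

```python
import numpy as np

def assert_layers_match(layer_a, layer_b, x, atol=1e-6):
    """Run the same input through two layer implementations and
    fail loudly with the max absolute difference if they disagree."""
    out_a, out_b = layer_a(x), layer_b(x)
    diff = np.max(np.abs(out_a - out_b))
    assert diff <= atol, f"layers disagree, max abs diff = {diff}"
    return out_a  # feed this into the next pair of layers

# Two stand-in "frameworks" computing the same dense layer
# with explicitly shared weights (the crucial step):
rng = np.random.default_rng(2)
W, b = rng.normal(size=(4, 3)), rng.normal(size=3)

dense_a = lambda x: x @ W + b          # e.g. the PyTorch side
dense_b = lambda x: np.dot(x, W) + b   # e.g. the TensorFlow side

x = rng.normal(size=(5, 4))
y = assert_layers_match(dense_a, dense_b, x)  # passes; y goes to the next pair
```

The same pattern then extends to gradients and optimizer steps, as Tom described above.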