I found that my model’s training performance depends on which C/C++ compiler the environment was built with (Intel compiler vs. GCC).
For my model (it uses PyTorch and PyTorch Geometric), the training loss stays large and saturates early when the Intel compiler is used; with GCC the model trains well (this holds even across two different machines, as long as the compiler is the same).
The difference is large (about 50%), which is usually not acceptable for the same model and hyperparameters.
I wonder whether this is expected behavior or a mistake on my part.
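One thing I suspect (this is my assumption, not something I have verified for my build) is floating-point optimization defaults: the Intel compiler historically enables value-unsafe floating-point optimizations by default (`-fp-model fast=1`), which allow reassociating reductions, while GCC preserves IEEE semantics unless `-ffast-math` is passed. Since floating-point addition is not associative, reordered sums can differ, as this minimal sketch shows:

```python
# Floating-point addition is not associative, so a compiler that
# reorders a reduction (e.g. under value-unsafe fast-math settings)
# can produce a different result than one that keeps source order.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # source order
right = a + (b + c)   # a reassociation a compiler might emit

print(left, right, left == right)  # the two sums differ in the last bit
```

A single-ulp difference per operation would not normally explain a 50% gap in loss by itself, but accumulated over long reductions it can change training trajectories.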
My environments are as follows.
CUDA 10.2 / 11.6
Intel compiler 19.1
GCC 4.8.5 / 10.2
(The two machines use different CUDA and GCC versions but are otherwise the same.)
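For reference, this is how I compare the two setups; `torch.__config__.show()` prints the compiler and build flags PyTorch itself was compiled with (the package version checks are just sanity checks on my side):

```shell
# Show the compiler and flags used to build the installed PyTorch
python -c "import torch; print(torch.__config__.show())"

# Sanity-check the library versions on both machines
python -c "import torch; print(torch.__version__, torch.version.cuda)"
python -c "import torch_geometric; print(torch_geometric.__version__)"
```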