I wanted to align the numerical results of PyTorch inference between two versions, but I found that adding the CUDA paths to PATH and LD_LIBRARY_PATH affected the inference results. So I ran some comparative experiments. This is my CUDA environment:
NVIDIA Driver: NVIDIA-SMI 460.73.01, Driver Version: 460.73.01, CUDA Version: 11.2
conda env a : torch1.9.0+cu111
conda env b : torch1.13.0+cu117
“SET_CUDA_ENV” means executing the following commands:
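(The original post omitted the exact commands; presumably they prepend the system CUDA 11.2 toolkit to the search paths, something like this.)

```shell
# Assumed commands: put the system CUDA 11.2 install first in the
# binary and shared-library search paths.
export PATH=/usr/local/cuda11.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda11.2/lib64:$LD_LIBRARY_PATH
```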
“NO_CUDA_ENV” means doing nothing to PATH and LD_LIBRARY_PATH.
Experiment 1: [ torch 1.9.0+cu111 & SET_CUDA_ENV ] vs [ torch 1.9.0+cu111 & NO_CUDA_ENV ] gave the same results.
Experiment 2: [ torch 1.13.0+cu117 & SET_CUDA_ENV ] vs [ torch 1.13.0+cu117 & NO_CUDA_ENV ] gave different results at the first torch.nn.Linear layer.
Experiment 3: [ torch 1.9.0+cu111 & SET_CUDA_ENV ] vs [ torch 1.13.0+cu117 & SET_CUDA_ENV ] gave different results at the first torch.nn.Linear layer.
Experiment 4: [ torch 1.9.0+cu111 & NO_CUDA_ENV ] vs [ torch 1.13.0+cu117 & NO_CUDA_ENV ] gave different results at the first torch.nn.Linear layer.
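For reference, the way I compare "same" vs "different" across two environments is to dump the layer's output to disk in each conda env and then diff the files (the exact harness below is a sketch of that idea, not the original code; it assumes the activations were saved with `np.save`):

```python
import numpy as np

def compare_outputs(path_a, path_b):
    """Load two saved activations (.npy files) and report (1) whether they
    match bitwise and (2) the largest element-wise absolute difference."""
    a = np.load(path_a)
    b = np.load(path_b)
    bitwise_equal = a.tobytes() == b.tobytes()
    max_diff = float(np.max(np.abs(a.astype(np.float64) - b.astype(np.float64))))
    return bitwise_equal, max_diff
```

In each env you would run the model up to the first `torch.nn.Linear` layer, save its output with `np.save("env_a.npy", out.cpu().numpy())`, and then compare the two files from a single process.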
Based on the above experiments, I can't tell which CUDA libraries PyTorch uses for inference. If PyTorch had prioritized /usr/local/cuda11.2 from PATH and LD_LIBRARY_PATH, experiment 3 should have yielded the same results. If the conda cudatoolkit were preferred, experiment 2 should have yielded the same results. These experiments suggest there are other factors influencing the model's inference.
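One Linux-side diagnostic I could try (my own idea, not something from the linked threads) is to inspect /proc/self/maps after running a CUDA op: the kernel lists every shared object the process has mapped, including the exact libcudart/libcublas files. A small helper to filter that listing:

```python
def loaded_cuda_libs(maps_text):
    """Extract shared-object paths that look like CUDA runtime/math libraries
    (libcudart, libcublas, libcudnn, ...) from /proc/<pid>/maps content."""
    keywords = ("cudart", "cublas", "cudnn", "cuda")
    libs = set()
    for line in maps_text.splitlines():
        parts = line.split()
        path = parts[-1] if parts else ""
        if ".so" in path and any(k in path for k in keywords):
            libs.add(path)
    return sorted(libs)
```

In a live session, after something like `torch.randn(8, 8, device="cuda") @ torch.randn(8, 8, device="cuda")`, running `print(loaded_cuda_libs(open("/proc/self/maps").read()))` would show whether the mapped libraries come from the conda env or from /usr/local/cuda11.2.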
I’ve seen these discussions: “How to check which cuda version my pytorch is using” and “How does PyTorch pick cuda versions?”.
Hope to get some explanation, thanks!