The simple code below yields different results on different GPUs (e.g., a 1070 gives 998874 but a 1080Ti gives 998836). Did I do something wrong, or is it simply impossible to get the same result on different GPUs?
import torch
import numpy as np
import random
seed = 0
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed) # if you are using multi-GPU.
np.random.seed(seed) # Numpy module.
random.seed(seed) # Python random module.
a = torch.ones(1000,1000).to('cuda:0')
dropout = torch.nn.Dropout(0.5).cuda()
b = dropout(a)
print(torch.sum(torch.abs(b)))
Are you using the same PyTorch version (including the CUDA and cuDNN versions)?
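A quick way to compare the setups is to print the relevant versions on each machine. The following is only a minimal sketch; `describe_environment` is a hypothetical helper name, not part of PyTorch, and it reports only the first visible GPU.

```python
import torch

def describe_environment():
    """Collect the version details that can affect GPU random number generation."""
    info = {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,               # None on CPU-only builds
        "cudnn": torch.backends.cudnn.version(),  # integer such as 7602, or None if unavailable
        "gpu": None,
    }
    if torch.cuda.is_available():
        # Name of the first visible device, e.g. the card model.
        info["gpu"] = torch.cuda.get_device_name(0)
    return info

print(describe_environment())
```

If the dictionaries differ only in the `gpu` entry, the software stack matches and any remaining difference comes from the hardware.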
Getting the same “random” numbers on different hardware is sometimes quite hard.
However, using your code, I get the same result (tensor(1000260., device='cuda:0')) for:
Yes, the same environment. I don’t have a TitanV to try but I guess it is quite similar to V100 so they could yield the same result.
More tests:
My local server (PyTorch 1.2.0, CUDA 10.0.130, cuDNN 7.6.2, 2080Ti) got 998908.
Instances on Google Cloud using the official PyTorch 1.2.0 image (exactly the same versions as above) got 1000260 on a V100 but 999100 on a K80.
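If identical dropout masks across GPU models matter more than speed, one possible workaround is to sample the mask on the CPU and copy it to the device, so the GPU's random number generator is not involved. This is only a sketch under assumptions: `dropout_with_cpu_mask` is a hypothetical helper, not the built-in `torch.nn.Dropout`, and PyTorch does not guarantee identical CPU random streams across releases or platforms either, so it narrows the problem rather than eliminating it.

```python
import torch

def dropout_with_cpu_mask(x, p=0.5, seed=0):
    """Apply inverted dropout using a keep-mask sampled on the CPU (assumes 0 <= p < 1)."""
    gen = torch.Generator(device="cpu")
    gen.manual_seed(seed)
    # Sample uniform numbers on the CPU; an element is kept with probability 1 - p.
    keep = torch.rand(x.shape, generator=gen) >= p
    # Move the mask to the input's device and dtype before applying it.
    keep = keep.to(device=x.device, dtype=x.dtype)
    # Scale kept activations by 1 / (1 - p), as dropout does in training mode.
    return x * keep / (1.0 - p)

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
a = torch.ones(1000, 1000, device=device)
b = dropout_with_cpu_mask(a, p=0.5, seed=0)
print(torch.sum(torch.abs(b)))
```

The cost is a host-to-device copy of the mask on every call, so this is better suited to debugging or comparison runs than to training at full speed.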