Is there a way to check whether my cuDNN/CUDA setup is consistent across different machines?

I have been able to fix the seed and make sure that the generated weights and biases, as well as the data coming out of the dataloader, are the same across two different machines/hardware:

import random
import numpy as np
import torch

random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

but the final results are still different, and I think the divergence comes from the optimization step and from CUDA/cuDNN (please let me know if I am wrong). Based on my observation, the weights, the dataloader, and the initializations are all consistent.

I was wondering if there is a way to also fix that part, or at least a way to check it?

You can check the versions via:

print(torch.version.cuda)
print(torch.backends.cudnn.version())
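If you want to compare the environments of two machines more systematically, one option is to collect the relevant version strings into a dict on each machine and diff the two dicts. This is just an illustrative sketch; `collect_versions` and `diff_versions` are hypothetical helpers, not part of PyTorch:

```python
def collect_versions():
    """Gather version info on one machine (illustrative helper, not a PyTorch API)."""
    import torch  # imported lazily so diff_versions stays torch-free
    info = {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,
        "cudnn": torch.backends.cudnn.version(),
    }
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info

def diff_versions(a, b):
    """Return the keys whose values differ between two machines' version dicts."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}

# Example with hand-written dicts from two hypothetical machines:
machine_a = {"torch": "2.1.0", "cuda": "12.1", "cudnn": 8902}
machine_b = {"torch": "2.1.0", "cuda": "11.8", "cudnn": 8700}
print(diff_versions(machine_a, machine_b))  # the keys that differ
```

Run `collect_versions()` on each machine, save the results, and any key reported by `diff_versions` points at a mismatch worth investigating.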

Note, however, that there is no guarantee of bitwise-identical results if different hardware is used.
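Beyond the cudnn flags you already set, newer PyTorch releases also provide `torch.use_deterministic_algorithms(True)`, which raises an error whenever an op without a deterministic implementation is hit, so you at least find out where nondeterminism enters. Some cuBLAS ops additionally require the `CUBLAS_WORKSPACE_CONFIG` environment variable to be set. A minimal sketch (the specific workspace value is one of the two documented options):

```python
import os

# Must be set before any cuBLAS call; required for deterministic
# cuBLAS behavior on CUDA >= 10.2 (":16:8" is the other documented value).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

# Raise an error instead of silently picking a nondeterministic kernel.
torch.use_deterministic_algorithms(True)
```

This won't make different GPUs agree bitwise, but it narrows the sources of run-to-run nondeterminism on a single machine.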
