Different results with the same settings, only the GPU differs

TaintlessCupcake · December 5, 2023, 11:18am

Hello,

I’m getting different losses with identical settings; the only difference is the GPU (a 4090 vs. a T4).

I have already fixed the seed and matched the library versions. What should I do to get the same result?

All the settings I use for reproducibility are below.

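    import random

    import numpy as np
    import torch

    # Seed every random number generator in use.
    torch.manual_seed(args.seed)
    torch.cuda.manual_seed(args.seed)
    torch.cuda.manual_seed_all(args.seed)
    random.seed(args.seed)
    np.random.seed(args.seed)
    # Ask cuDNN for deterministic kernels and disable autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Disable TF32 so matmuls and convolutions run in full float32.
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False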

The loss outputs at the first step are
1.4394590854644775 0.6447765231132507 0.0026639089919626713 0.0
and
1.4394253492355347 0.6326223611831665 0.002665554638952017 0.0

The difference is slight, but it makes a significant difference in the final result.
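
To put a number on "slight", the relative gap per loss term can be computed from the two printed outputs. This is only a minimal sketch: `run_a`, `run_b`, and `relative_gap` are hypothetical names, and the post does not say which GPU produced which line.

```python
# Loss values copied from the two first-step outputs in the post.
# run_a / run_b are hypothetical labels; which GPU printed which line is unknown.
run_a = [1.4394590854644775, 0.6447765231132507, 0.0026639089919626713, 0.0]
run_b = [1.4394253492355347, 0.6326223611831665, 0.002665554638952017, 0.0]


def relative_gap(a: float, b: float) -> float:
    """Return |a - b| scaled by the larger magnitude, or 0.0 when both are zero."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0.0 else abs(a - b) / scale


# One relative gap per loss term, in the order the terms were printed.
gaps = [relative_gap(a, b) for a, b in zip(run_a, run_b)]

for index, gap in enumerate(gaps):
    print(f"term {index}: relative gap = {gap:.3e}")
```

With these numbers, the second term differs by roughly 2%, while the first and third differ by far less and the fourth is identical.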

Thanks.

ptrblck · December 5, 2023, 4:29pm

You would have to use the same device, as there is no guarantee that the same algorithms will be selected across different GPU architectures.
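
One reason the selected algorithm matters is that floating-point addition is not associative, so two kernels that accumulate the same values in a different order can round differently. Here is a minimal pure-Python sketch of that effect. It is an illustration only, not the actual GPU kernels, and the function names are hypothetical.

```python
# Illustration only: floating-point addition is not associative, so the order
# in which partial results are accumulated can change the final bits.
values = [0.1, 0.2, 0.3]


def sum_left_to_right(xs):
    """Accumulate from the first element to the last."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_right_to_left(xs):
    """Accumulate from the last element to the first."""
    total = 0.0
    for x in reversed(xs):
        total += x
    return total


forward = sum_left_to_right(values)
backward = sum_right_to_left(values)

# Same inputs, different accumulation order, different rounded result.
print(forward == backward)
print(abs(forward - backward))
```

Reordering of this kind, repeated across many operations and training steps, can compound, which would be consistent with small first-step gaps growing into different final results.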