Different speedups with multi GPU training on two machines with same code

Hi All,

I am trying to run some code in a multi-GPU setting using DataParallel. I have access to two machines with 2 GPUs each.

Machine 1: Two 1080Ti
Machine 2: Two 2080Ti

I am running the same code with the same dataset in a multi-GPU setting on each machine separately. The code works fine on Machine 2 and gives the expected speedup. On Machine 1, however, the code starts fine, but after a few training steps its speed drops drastically. On both machines I am using PyTorch version 1.1.0.
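For reference, the model is wrapped roughly like this minimal sketch (the model itself is just a placeholder here, not my actual architecture):

```python
import torch
import torch.nn as nn

# Placeholder model; the real architecture is different.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Wrap for multi-GPU training across both visible devices.
model = nn.DataParallel(model).cuda()
```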

Are the machines otherwise identical besides the GPUs, i.e. do they use approximately the same SSDs, CPUs, etc.?

Also, are you using the same CUDA and cuDNN versions on both machines?
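You can print them from within PyTorch on each machine to compare:

```python
import torch

# Report the PyTorch build and the CUDA / cuDNN versions it was compiled against.
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
```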

The difference might not come directly from the GPUs, but from other bottlenecks, e.g. data loading and processing.
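To narrow it down, you could time the data loading separately from the actual training step, e.g. with a sketch like this (it assumes your existing `loader`, `model`, `criterion`, and `optimizer`; note the `torch.cuda.synchronize()` before reading the clock, since CUDA calls are asynchronous):

```python
import time
import torch

data_time, total_time = 0.0, 0.0
end = time.perf_counter()
for inputs, targets in loader:
    # Time spent waiting for the DataLoader to yield the next batch.
    data_time += time.perf_counter() - end

    inputs = inputs.cuda(non_blocking=True)
    targets = targets.cuda(non_blocking=True)

    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # Wait for the queued GPU work to finish before measuring.
    torch.cuda.synchronize()
    total_time += time.perf_counter() - end
    end = time.perf_counter()

print(f'data loading: {data_time:.1f}s, total: {total_time:.1f}s')
```

If `data_time` dominates on Machine 1 once the slowdown kicks in, the GPUs are starving and you should look at the storage, CPU, or number of DataLoader workers rather than at the GPUs themselves.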