I am using a 12 GB GPU (Titan Xp) and I was surprised to notice that running two experiments of the same PyTorch program at the same time (in different terminals; each takes around 4 GB of memory) takes about twice as long as running each one separately.
I’m not sure if I am missing something here, or if I need to adjust some parameters to improve speed. Or is this simply expected behavior in this situation?
You can check with nvidia-smi, but most networks are limited not by memory but by how much the GPU can compute. This shows up as the GPU-Util field in nvidia-smi.
If one job alone already uses 100% (or close to it), then the GPU is fully utilized, so the two jobs effectively run one after the other, which explains the doubled runtime.
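To verify this, you can poll utilization while one experiment is running. A minimal sketch using nvidia-smi's query mode (the exact fields available may vary slightly with driver version):

```shell
# Print GPU compute utilization and memory usage once per second.
# utilization.gpu near 100% while a single job runs means the GPU's
# compute is saturated, and a second job will mostly just wait.
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
           --format=csv -l 1
```

If utilization with one job is well below 100% (e.g. small batches or heavy CPU-side data loading), then two jobs can genuinely overlap and you would not expect the 2x slowdown; in that case the bottleneck is elsewhere.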