ASRock Z390 Extreme4
2x RTX 2080 Ti
Cooler Master V1200 Platinum
PyTorch was installed according to the guide on pytorch.org
So I’ve got something interesting: the PC crashes (reboots) as soon as I run the multi-GPU ImageNet script from the official PyTorch repository. It doesn’t crash if I train with Apex mixed precision, and training on a single 2080 Ti didn’t cause a reboot either.
What didn’t work:
- decreasing batch size
- limiting the GPUs’ power consumption via nvidia-smi
- changing the motherboard, CPU, and power supply
- changing the 2080 Ti vendor
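For anyone who wants to repeat the power-limit test, here is a minimal sketch of how the cap can be applied to each card through nvidia-smi (`-i` selects the GPU, `-pl` sets the power limit in watts). The helper names, the GPU indices 0 and 1, and the 200 W value are illustrative assumptions, not the exact settings from the setup above. By default it only prints the commands; actually applying a limit usually requires root privileges.

```python
import subprocess
from typing import List


def power_limit_cmd(gpu_index: int, watts: int) -> List[str]:
    """Build the nvidia-smi command that caps one GPU's power draw."""
    if gpu_index < 0 or watts <= 0:
        raise ValueError("gpu_index must be >= 0 and watts must be > 0")
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def apply_power_limits(gpu_indices: List[int], watts: int,
                       dry_run: bool = True) -> List[List[str]]:
    """Apply (or, with dry_run=True, just print) the same cap for each GPU."""
    cmds = [power_limit_cmd(i, watts) for i in gpu_indices]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError if nvidia-smi rejects the value,
            # e.g. when it is outside the card's supported range.
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Hypothetical values: two GPUs (indices 0 and 1), 200 W cap each.
    apply_power_limits([0, 1], 200)
```

Running `nvidia-smi -q -d POWER` afterwards shows whether the new limit was accepted.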
For some reason, everything worked after I swapped both 2080 Tis for 1080 Tis. So it seems PyTorch (or some NVIDIA software) isn’t fully compatible with multiple 2080 Tis? Has anyone else encountered this?