My GPU is dead while using Nvidia Apex

I’m not sure if this is caused by Apex or PyTorch, as I’ve also seen this behavior using plain PyTorch. If I’m not mistaken, this should be fixed in the latest stable release.

If you didn’t overclock the GPU, it should be fine. If your device overheats, e.g. because your GPUs are packed tightly into the case, it should first reduce its clock and shut down only as a last step.
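If you want to check whether the device is actually running hot, a rough sketch like the following polls `nvidia-smi` for temperature and SM clock. The helper names and the 85 °C threshold are my own placeholders, not anything from Apex or PyTorch, and it assumes `nvidia-smi` is on your `PATH` and reports numeric values for these fields.

```python
import subprocess

# Fields passed to `nvidia-smi --query-gpu`; one CSV row is printed per GPU.
QUERY_FIELDS = "index,temperature.gpu,clocks.sm"


def parse_gpu_status(csv_text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts.

    Assumes every field is numeric; some devices may report "[N/A]" instead,
    which this sketch does not handle.
    """
    status = []
    for line in csv_text.strip().splitlines():
        index, temp_c, sm_clock_mhz = [field.strip() for field in line.split(",")]
        status.append(
            {
                "index": int(index),
                "temp_c": int(temp_c),
                "sm_clock_mhz": int(sm_clock_mhz),
            }
        )
    return status


def query_gpu_status():
    """Run nvidia-smi once and return the parsed status of all GPUs."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_status(result.stdout)


def hot_gpus(status, threshold_c=85):
    """Return indices of GPUs at or above threshold_c (hypothetical default)."""
    return [gpu["index"] for gpu in status if gpu["temp_c"] >= threshold_c]
```

Calling `hot_gpus(query_gpu_status())` every few seconds during training should show whether the temperature climbs while the SM clock drops, which would point to thermal throttling.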

It depends on your code; for example, you might have a data loading bottleneck.
This post explains some workarounds.
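To see whether data loading is the bottleneck in your own code, you could time how long the loop waits for each batch versus how long the training step takes. This is only a minimal sketch: `profile_loop` and `step_fn` are hypothetical names, and `batches` stands in for whatever iterable you use (e.g. a `DataLoader`).

```python
import time


def profile_loop(batches, step_fn, max_steps=50):
    """Return (seconds waiting for data, seconds in step_fn, steps run).

    For GPU work, step_fn should synchronize before returning (e.g. by
    calling torch.cuda.synchronize()), since CUDA kernels run asynchronously
    and the step would otherwise look faster than it is.
    """
    data_time = 0.0
    step_time = 0.0
    steps = 0
    iterator = iter(batches)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is the wait for data
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # forward/backward/optimizer step would go here
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
        steps += 1
    return data_time, step_time, steps
```

If the data time is a large share of the total, the GPU is mostly idle waiting for input, and more loader workers or faster storage are the first things I would try.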