CUDA error: unspecified launch failure

Thank you! Yes, Xid 79 is helpful, as it indicates that the “GPU has fallen off the bus”, which can be caused by a hardware error, a driver error, system memory corruption, or a thermal issue.
I think I saw the same error a few weeks ago on a system where the power plug wasn’t fully connected to the GPU.
Given that, Xid 79 should not be raised by user code, so I don’t believe your PyTorch code (or any library) is the root cause.
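
If you want to check whether other Xid codes show up around the failure, here is a minimal sketch that pulls Xid entries out of kernel log text. It assumes the messages follow the usual `NVRM: Xid (PCI:...): <code>, ...` layout; the exact format can differ between driver versions. The names `XID_PATTERN`, `find_xid_events`, and `SAMPLE_LOG` are made up for this example, and the sample line is illustrative, not taken from your machine.

```python
import re

# Assumed layout of an NVIDIA driver Xid message in the kernel log, e.g.
# "NVRM: Xid (PCI:0000:01:00): 79, pid=1234, GPU has fallen off the bus."
# The format may vary between driver versions.
XID_PATTERN = re.compile(
    r"NVRM: Xid \((?P<device>PCI:[0-9A-Fa-f:.]+)\): (?P<code>\d+)(?:, (?P<detail>.*))?"
)


def find_xid_events(log_text):
    """Return a list of (device, xid_code, detail) tuples found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            detail = (match.group("detail") or "").strip()
            events.append((match.group("device"), int(match.group("code")), detail))
    return events


# Illustrative sample only (hypothetical values, not a real log).
SAMPLE_LOG = """\
[12345.678901] NVRM: Xid (PCI:0000:01:00): 79, pid=1234, GPU has fallen off the bus.
[12345.679000] some unrelated kernel message
"""

if __name__ == "__main__":
    for device, code, detail in find_xid_events(SAMPLE_LOG):
        print(device, code, detail)
```

On Linux you could feed it the real log, e.g. `find_xid_events(subprocess.run(["dmesg"], capture_output=True, text=True).stdout)`, keeping in mind that reading `dmesg` may require root on some systems.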