Python kernel crashes with GPU

Hello! I have an Alienware 18r2 with an NVIDIA 4080, running NVIDIA driver 556.12. I tried CUDA 12.4 with PyTorch to train an LSTM model, but the kernel keeps crashing after a few minutes. Has anyone had a similar issue using a GPU for machine learning? The only way I can train the model is by using ONLY the CPU, but that's not the idea.

Check whether dmesg is showing any Xid errors that could correlate with the crash. If not, run your script in a terminal instead of the notebook kernel and check whether a valid stack trace points to the problem.
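If it helps, here is a rough sketch of how the dmesg check could be scripted. It assumes a Linux system where `dmesg` is available and readable by your user, and that the NVIDIA driver logs Xid events in its usual `NVRM: Xid (<PCI address>): <code>, ...` form. The names `find_xid_events` and `read_kernel_log` are just made up for this example.

```python
import re
import subprocess

# Assumed log format: "NVRM: Xid (PCI:0000:01:00): 79, pid=..., ..."
XID_PATTERN = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+)")


def find_xid_events(log_text):
    """Return (device, xid_code, full_line) for every Xid line in log_text."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append((match.group(1), int(match.group(2)), line.strip()))
    return events


def read_kernel_log():
    """Return the output of dmesg, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        # dmesg is missing (e.g. not a Linux system) or cannot be executed.
        return ""
    return result.stdout
```

Calling `find_xid_events(read_kernel_log())` right after a crash and comparing the timestamps on the matching lines with the time the kernel died should show whether the two are related. An empty list only means no Xid line was found in the part of the log that could be read, so the terminal run is still worth doing.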