Hi
I am running a 3080 GPU on Ubuntu 20.04 LTS with NVIDIA driver 460.39.
My training program randomly crashes with a segmentation fault.
I have tried several times but could not pinpoint the cause.
At first I was using the nightly build, but I later switched to 1.8.0 (with CUDA 11.1, installed via conda) and the issue persists.
I tested the same code on a 1080 Ti, a 2080 Ti and a Titan RTX. All of them run perfectly fine; only the 3080 crashes.
Here is a screenshot of the gdb result:
It also sometimes gives “illegal instruction” errors.
Could anyone help me with this?
Thanks.