Segfault when running on 3080

Hi

I am running a 3080 GPU on Ubuntu 20.04 LTS with NVIDIA driver 460.39.
For some reason, my training program randomly crashes with a segmentation fault.
I have tried multiple times but could not pinpoint the cause.
At first I was using the nightly build, but I later switched to 1.8.0 (with CUDA 11.1, installed via conda) and the issue persists.
I tested the same code on a 1080 Ti, a 2080 Ti and an RTX Titan. All of them run perfectly fine; only the 3080 fails.

Here is a screenshot of the gdb result:

Sometimes it fails with an “illegal instruction” error instead.

Could anyone help me with this?
Thanks.

Do you get a stack trace for the illegal instruction?
Could you also post an executable code snippet to reproduce this issue using random inputs, so that we could debug it?
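
In case it helps with capturing a trace for these intermittent crashes, here is a minimal sketch using Python's standard `faulthandler` module, which dumps the Python traceback when the process receives SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL. The helper name `enable_crash_traces` and its `log_path` parameter are made up for illustration and are not part of any library.

```python
import faulthandler
import sys


def enable_crash_traces(log_path=None):
    """Dump the Python traceback of all threads on a fatal signal.

    log_path is a hypothetical optional file path. If it is None, the
    traceback goes to stderr. The file object is returned because
    faulthandler needs it to stay open until the process exits.
    """
    log_file = open(log_path, "w") if log_path else None
    faulthandler.enable(file=log_file if log_file else sys.stderr,
                        all_threads=True)
    return log_file
```

Calling `enable_crash_traces()` at the very top of the training script should be enough. The same effect is available without code changes by starting the interpreter with `python -X faulthandler`. Note that this only shows the Python-level frames; the native frames still need gdb.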

Unfortunately I do not have a stack trace for the illegal instruction error.
It only occurs intermittently.
I will update this post once I find a way to reproduce it consistently.
Thanks.