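Rerun your code with CUDA_LAUNCH_BLOCKING=1 to isolate the failing operation. With this variable set, kernel launches run synchronously, so the stack trace points to the operation that actually failed. Then check why it’s failing, as described in this topic, which ran into the same issue.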
If you get stuck, post a minimal, executable code snippet that reproduces the issue, as was done in the other thread.
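The usual way to set the variable is from the shell when launching your script. If that is inconvenient, here is a rough sketch of a Python wrapper that does the same thing. The helper name `run_with_launch_blocking` is made up for illustration, and it assumes your code can be started as a separate command:

```python
import os
import subprocess
import sys


def run_with_launch_blocking(argv):
    """Run a command with CUDA_LAUNCH_BLOCKING=1 added to its environment.

    Hypothetical helper, not part of PyTorch. The variable is set before the
    child process starts, so it is in place before CUDA is initialized there.
    """
    env = dict(os.environ)             # copy, so the parent env stays untouched
    env["CUDA_LAUNCH_BLOCKING"] = "1"  # make CUDA kernel launches synchronous
    return subprocess.run(argv, env=env, capture_output=True, text=True)
```

For example, `result = run_with_launch_blocking([sys.executable, "your_script.py"])` (where `your_script.py` is a placeholder for your own file) followed by `print(result.stderr)` should show the stack trace. Setting `os.environ["CUDA_LAUNCH_BLOCKING"] = "1"` inside the script can also work, but only if it happens before any CUDA work starts, so the wrapper or the shell is the safer option.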