RuntimeError: CUDA error: out of memory; Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions

Trouble training YOLOv5

It worked well with PyTorch 1.7, but after I upgraded to PyTorch 2.0 it always shows ‘CUDA out of memory’. My GPUs are 3090s.
And I have added

export CUDA_LAUNCH_BLOCKING=1
export TORCH_USE_CUDA_DSA=1

as the error message suggests.
Now it shows

RuntimeError: CUDA error: out of memory; Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

What should I do to solve this problem?

torch.cuda.is_available() returns True
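
For readers who launch training from a Python entry point rather than a shell, the two exports above can be sketched as follows. This is a minimal sketch, not a fix for the error: the helper name `configure_cuda_debug_env` is hypothetical, and it assumes the variable is set before `torch` is imported so that the CUDA runtime sees it. As far as I know, `TORCH_USE_CUDA_DSA` is a build-time option (hence "Compile with" in the message), so exporting it at runtime is not expected to change the behavior of a prebuilt PyTorch binary; the sketch only reports whether it is set.

```python
import os


def configure_cuda_debug_env():
    """Set CUDA debugging variables for this process (hypothetical helper).

    Assumption: this runs before `import torch`, so the CUDA runtime
    picks the value up when it initializes.
    """
    # Make kernel launches synchronous so errors surface at the failing call.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    # Report both variables; TORCH_USE_CUDA_DSA is only read back, not set,
    # because it is believed to be a compile-time option.
    names = ("CUDA_LAUNCH_BLOCKING", "TORCH_USE_CUDA_DSA")
    return {name: os.environ.get(name) for name in names}


settings = configure_cuda_debug_env()
print(settings)
# import torch  # import only after the environment has been configured
```

With synchronous launches the stack trace should point at the call that actually failed, which makes it easier to tell a genuine out-of-memory condition from an unrelated CUDA error.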

Could you post a script (e.g., with random/dummy data) that reproduces the issue/high memory usage that worked in 1.7 but crashes in 2.0?

I’m sorry for bothering you. The issue disappeared on the second day, so I forgot about my question. It may have been caused by unkilled processes, and coincidentally I had just upgraded PyTorch, so I blamed the error on PyTorch…
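
For anyone who wants to rule out the same cause, here is a minimal sketch for listing processes that still hold GPU memory. It assumes `nvidia-smi` is on the `PATH` and uses its `--query-compute-apps` CSV output; the function names `parse_compute_apps` and `list_gpu_processes` are hypothetical helpers written for this example.

```python
import csv
import io
import shutil
import subprocess

# nvidia-smi query that prints one CSV row per compute process.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse CSV rows into (pid, process_name, used_mib) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, name, used = (field.strip() for field in fields)
        if not pid.isdigit():
            continue
        # used_memory can be reported as "[N/A]"; keep that as None.
        rows.append((int(pid), name, int(used) if used.isdigit() else None))
    return rows


def list_gpu_processes():
    """Return the processes currently using the GPU, or [] without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    for pid, name, used_mib in list_gpu_processes():
        print(f"pid={pid} name={name} used_mib={used_mib}")
```

If a stale training run shows up in that list, stopping it should release the memory it holds. Whether that was the actual cause here is only the poster's guess.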

Hey, I’m not getting the ‘CUDA: out of memory’ error, only the latter half, i.e. Compile with ‘TORCH_USE…’. I don’t have any active processes either. Can you tell me which PyTorch version you upgraded to?

Any update on this? I am facing this issue when running parallel programming code on a server.
I tried PyTorch 2.0 with CUDA 11.8 and also PyTorch 1.13 with CUDA 11.7, but no luck. Any thoughts?