During backward() | CUDA error: an illegal memory access was encountered

Hello,
Thank you.
I ran my code as you suggested.
The error message is below. (compute-sanitizer reports: Error: No attachable process found.)
By the way, I am training a convolutional neural network (CNN). If I set initial_channel=32, the error does not occur.
However, if initial_channel is greater than 32, it does.

========= COMPUTE-SANITIZER
========= Error: No attachable process found. compute-sanitizer timed-out.
========= Default timeout can be adjusted with --launch-timeout. Awaiting target completion.
Traceback (most recent call last):
  File "/home/uu/@Research/BlockHH/main.py", line 130, in <module>
    main(config)
  File "/home/uu/@Research/BlockHH/main.py", line 100, in main
    trainer()
  File "/home/uu/@Research/BlockHH/trainer/trainer.py", line 243, in __call__
    scaler.scale(loss).backward()
  File "/home/uu/.pyenv/versions/3.11.1/lib/python3.11/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/uu/.pyenv/versions/3.11.1/lib/python3.11/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: misaligned address
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
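
For the next run, the two hints in the output above (CUDA_LAUNCH_BLOCKING=1 and --launch-timeout) could be combined roughly as follows. This is only a sketch: the helper name build_debug_launch, the 120-second timeout value, and launching main.py through `python` are my assumptions, not something taken from the log. It only builds the command and environment and does not run anything.

```python
import os
import shlex


def build_debug_launch(script="main.py", launch_timeout_s=120):
    """Return (command, env) for a debugging rerun; nothing is executed here.

    `script` and `launch_timeout_s` are illustrative defaults (assumptions).
    """
    env = dict(os.environ)
    # Make CUDA kernel launches synchronous, so the Python traceback is more
    # likely to point at the call that actually failed.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    # --launch-timeout is the flag named in the compute-sanitizer message;
    # treating the value as seconds is an assumption to check against the docs.
    command = [
        "compute-sanitizer",
        "--launch-timeout", str(launch_timeout_s),
        "python", script,
    ]
    return command, env


if __name__ == "__main__":
    command, env = build_debug_launch()
    # Print the equivalent shell line for copy and paste.
    print("CUDA_LAUNCH_BLOCKING=1 " + shlex.join(command))
```

The returned pair can be passed to subprocess.run(command, env=env) if the run should be started from Python rather than from the shell.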