When I use amp to accelerate the model, I get “RuntimeError: CUDA error: device-side assert triggered”

I want to use amp for mixed-precision training. But when I enable amp.autocast:


I get the following error:

What puzzles me is that I can train the model normally when amp.autocast is disabled. How can I solve this problem?

Could you rerun the script via CUDA_LAUNCH_BLOCKING=1 python script.py args and post the stacktrace here by wrapping it into three backticks ```, please?
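If changing the launch command is inconvenient, the same variable can be set from inside the script, as long as this happens before CUDA is initialized. A minimal sketch, assuming the lines go at the very top of script.py:

```python
import os

# CUDA_LAUNCH_BLOCKING is read when CUDA is initialized, so set it first.
# Doing this before importing torch is the safe choice.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# The rest of the script (import torch, model setup, training loop) follows here.
```

With blocking launches the Python stacktrace should point at the operation that actually failed instead of a later, unrelated call.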

The same error.

Thanks for the update! The stacktrace points to an invalid indexing operation in:

points[batch_indices, idx, :]

so check the shape of points as well as the min. and max. values of batch_indices and idx and make sure they are valid.
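The check described above could be sketched roughly as follows. The helper name check_index_bounds is hypothetical, and the sketch assumes points has shape (B, N, C) with batch_indices indexing the first dimension and idx the second. The toy tensors are only there to make the example runnable on the CPU.

```python
import torch


def check_index_bounds(points, batch_indices, idx):
    """Return a list of problems for points[batch_indices, idx, :].

    Hypothetical helper; assumes points has shape (B, N, C).
    """
    num_batches, num_points = points.shape[0], points.shape[1]
    problems = []
    for name, index, limit in (
        ("batch_indices", batch_indices, num_batches),
        ("idx", idx, num_points),
    ):
        if index.numel() == 0:
            continue
        # Inspect a CPU copy so the check itself cannot trigger a device assert.
        index = index.detach().cpu()
        lo, hi = int(index.min()), int(index.max())
        # PyTorch accepts negative indices down to -limit (they wrap around).
        if lo < -limit or hi >= limit:
            problems.append(
                f"{name}: range [{lo}, {hi}] is outside [{-limit}, {limit - 1}]"
            )
    return problems


# Toy data: B=2, N=4, C=3. bad_idx contains 4, which is out of range for N=4.
points = torch.zeros(2, 4, 3)
batch_indices = torch.tensor([[0], [1]])
good_idx = torch.tensor([[0, 3], [1, 2]])
bad_idx = torch.tensor([[0, 3], [1, 4]])

print(check_index_bounds(points, batch_indices, good_idx))
print(check_index_bounds(points, batch_indices, bad_idx))
```

Calling such a helper right before the indexing line, once with autocast enabled and once without, would show whether the index values differ between the two runs.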

But I can train the model normally when amp.autocast is disabled, so I don’t think the values of points are the cause of the problem.
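One possibility, which nothing in this thread confirms, is that autocast changes the indices rather than points: if idx is derived from a floating-point computation that runs in float16 under autocast, the resulting values can differ from the float32 run. The sketch below only illustrates float16's limits with made-up values and is not taken from the model in question.

```python
import torch

# Made-up values standing in for numbers that later become indices.
values = torch.tensor([100.0, 2049.0, 70000.0])

# float16 has an 11-bit significand and a maximum finite value of 65504.
as_fp16 = values.half()

print(as_fp16)                  # 2049 is rounded, 70000 overflows
print(torch.isfinite(as_fp16))  # the overflowed entry is not finite
```

If something like this is happening, printing the dtype, min, and max of idx under autocast should make it visible.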