RuntimeError: `initialize_buckets` must NOT be called during autograd execution

I want to use DDP to train my model, but I run into this issue:

RuntimeError: initialize_buckets must NOT be called during autograd execution.

Could anyone help me with this?

This error appears at the end of the second epoch; the first epoch runs fine.

Could you please share a self-contained, reproducible source file for this problem?

A repro would be helpful. In addition, DistributedDataParallel cannot handle multi-device runs with non-evenly divisible batch sizes · Issue #46175 · pytorch/pytorch · GitHub might provide valuable context.

Are you using a single GPU per process or multiple GPUs per process? The latter mode is deprecated and no longer maintained by PyTorch; we suggest moving to a single GPU per process, which is the more performant and supported approach going forward.