How to deallocate the DDP gradient buckets?

How can I deallocate the DDP gradient buckets to reduce the memory footprint?

In my model, I want to use my own buckets for gradient bucketing instead of the DDP gradient buckets. Can I deallocate the DDP gradient buckets, which are roughly the same size as the model parameters?

Thanks.

I'm curious how you do the custom bucketing. Are you using a DDP communication hook to implement it? One suggestion: use views of the gradient bucket instead of clones, which may help save some memory.
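To make the view-versus-clone idea concrete, here is a rough sketch, not a drop-in solution. The names `split_into_views` and `allreduce_in_place_hook` are made up for illustration, and the hook assumes `state` is either a process group or `None`. It only shows working on the bucket's own flat tensor in place; how this fits your custom bucketing is something I can only guess at.

```python
import math

import torch
import torch.distributed as dist


def split_into_views(flat, shapes):
    """Carve a flat 1-D buffer into per-tensor views (hypothetical helper).

    Each returned tensor shares storage with `flat`, so no extra memory
    is allocated for the gradient data itself.
    """
    views, offset = [], 0
    for shape in shapes:
        numel = math.prod(shape)
        views.append(flat[offset:offset + numel].view(shape))
        offset += numel
    return views


def allreduce_in_place_hook(state, bucket):
    """Comm hook sketch that averages a bucket in place instead of cloning it.

    Assumption: `state` is a process group, or None for the default group.
    """
    group = state if state is not None else dist.group.WORLD
    flat = bucket.buffer()        # the bucket's own flat tensor, not a copy
    flat.div_(group.size())       # in-place pre-division so the sum is a mean
    work = dist.all_reduce(flat, group=group, async_op=True)
    # The returned future must resolve to a tensor shaped like the bucket.
    return work.get_future().then(lambda fut: fut.value()[0])
```

A hook like this would be registered with `ddp_model.register_comm_hook(state=None, hook=allreduce_in_place_hook)`. Separately, constructing DDP with `gradient_as_bucket_view=True` makes each `param.grad` a view into the buckets, which avoids holding a second gradient copy of roughly model size; whether that is enough for your case depends on how your custom buckets are laid out.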