Hi, I'm working on DLRMs, where the gradients and weights are sparse and contain a lot of redundancy. I implemented a custom all_reduce that supports sparse (COO) tensors, but it's hard to replace the default all_reduce since it is triggered by autograd. Is there any way to do that?