How do I make a DDP job quit when one rank hits an OOM error?

In a distributed data parallel (DDP) job, because of the special format of my data batches, some ranks may receive more samples than others. When these “heavy” ranks run out of memory, the job just hangs. I think the “light” ranks are waiting for the “heavy” ones to finish. Is there a way to signal all the ranks so that they all quit?
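
To make the question concrete, here is a minimal sketch of the kind of signalling I have in mind: each rank records whether its own step ran out of memory, then all ranks take the maximum of that flag, so a single failure tells everyone to quit. This is only an illustration under assumptions. The names `run_step_with_oom_vote`, `reduce_max`, and `FakeGroup` are hypothetical, and the "ranks" are simulated with threads so the snippet runs without GPUs. In a real job, `reduce_max` would presumably wrap something like `torch.distributed.all_reduce(flag, op=torch.distributed.ReduceOp.MAX)`, and the caught exception would be `torch.cuda.OutOfMemoryError`.

```python
import threading
from typing import Callable, List, Tuple, Type


def run_step_with_oom_vote(
    run_step: Callable[[], None],
    reduce_max: Callable[[int], int],
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
) -> bool:
    """Run one step on this rank; return True if every rank should quit.

    run_step   : hypothetical callable doing this rank's work for one step.
    reduce_max : hypothetical callable returning the max of the flag over
                 all ranks (assumption: a wrapper around an all-reduce).
    oom_errors : exception types treated as OOM (assumption: with PyTorch
                 this would be (torch.cuda.OutOfMemoryError,)).
    """
    local_failed = 0
    try:
        run_step()
    except oom_errors:
        local_failed = 1
    # Every rank reaches this call whether or not its own step failed,
    # so the collective stays matched across ranks.
    return reduce_max(local_failed) == 1


class FakeGroup:
    """Thread-based stand-in for a process group, good for one vote only."""

    def __init__(self, world_size: int) -> None:
        self.flags = [0] * world_size
        self.barrier = threading.Barrier(world_size)

    def reduce_max_for(self, rank: int) -> Callable[[int], int]:
        def reduce_max(flag: int) -> int:
            self.flags[rank] = flag
            self.barrier.wait()  # wait until every rank has written its flag
            return max(self.flags)

        return reduce_max


def demo() -> List[bool]:
    """Simulate three ranks where rank 1 is the 'heavy' one that OOMs."""

    def light() -> None:
        pass

    def heavy() -> None:
        raise MemoryError("simulated OOM")

    steps = [light, heavy, light]
    group = FakeGroup(len(steps))
    results: List[bool] = [False] * len(steps)

    def worker(rank: int) -> None:
        results[rank] = run_step_with_oom_vote(
            steps[rank], group.reduce_max_for(rank)
        )

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(len(steps))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(demo())
```

In this simulation all three ranks agree to quit even though only one of them failed. What I am unsure about is whether this works in a real DDP job: if the OOM happens during the backward pass, the other ranks may already be blocked inside DDP's gradient all-reduce, so the extra flag collective would not line up with what they are waiting on.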