How does distributed data parallel detect an error on one process?

Say I am running distributed data parallel training on 8 GPUs, and one GPU runs out of memory. How do the processes on the other GPUs know that one GPU is out of memory?

I am curious about how the mechanism behind it works.