Question about gathering the gradients of variables across different GPUs

Hi, when the network runs its backward pass on multiple GPUs, where in the source code can I find the part that gathers the gradients? Thanks very much.