Multi-GPU training: memory usage imbalance

I found this:

It seems the output is always gathered to the first GPU. Is this intentional, or a temporary solution?
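Assuming this refers to PyTorch's `nn.DataParallel` (a guess based on the symptom described), here is a minimal sketch of the behavior in question: the input batch is scattered across GPUs, but every replica's output is gathered back onto the first device, so GPU 0 carries extra memory. The model and shapes below are hypothetical, purely for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical minimal model, just to illustrate the gather behavior.
model = nn.Linear(16, 4)

if torch.cuda.device_count() > 1:
    # nn.DataParallel scatters the batch across the GPUs, replicates the
    # module, then gathers every replica's output onto device_ids[0].
    # GPU 0 therefore holds the full output (and any loss computed from
    # it) in addition to its own replica -- one source of the imbalance.
    model = nn.DataParallel(model).cuda()  # device_ids defaults to all GPUs
    x = torch.randn(32, 16).cuda()
else:
    # Falls back to a plain forward pass on CPU or a single GPU.
    x = torch.randn(32, 16)

out = model(x)
# With DataParallel, `out` lives on the first device regardless of how
# many GPUs processed slices of the batch.
print(tuple(out.shape))
```

One commonly suggested mitigation is to compute the loss inside the wrapped module, so that only small scalar tensors get gathered to the first GPU instead of the full output.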