Distributed training creates multiple processes on GPU 0

Found the bug. We need to make sure the correct GPU context is set before calling `empty_cache()`; otherwise each worker process initializes a CUDA context on GPU 0, and a fixed chunk of memory stays allocated there for every other GPU's process. Relevant issue here.
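For reference, here is a minimal sketch of the workaround, assuming a torchrun-launched DDP setup (the `setup_and_clear` name and the use of `LOCAL_RANK` are illustrative, not from the original post):

```python
import os

import torch
import torch.distributed as dist


def setup_and_clear(local_rank: int) -> None:
    # Bind this process to its own GPU *before* any CUDA call.
    # Without this, empty_cache() (or any other CUDA op) initializes a
    # context on GPU 0 in every process, leaving a fixed memory block there.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    # The cache is now cleared on this process's own device, not on GPU 0.
    torch.cuda.empty_cache()

    # Alternatively, scope a single call to a specific device:
    with torch.cuda.device(local_rank):
        torch.cuda.empty_cache()


if __name__ == "__main__":
    # torchrun sets LOCAL_RANK for each spawned process.
    setup_and_clear(int(os.environ["LOCAL_RANK"]))
```

The key point is simply that `torch.cuda.set_device(local_rank)` (or a `torch.cuda.device(...)` context) runs before any call that touches CUDA.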
