Speed deteriorates when two replicates run with the same random seed


I ran into an interesting problem. When I run two replicates of my model on two GPUs using the CUDA_VISIBLE_DEVICES environment variable, and initialize them with the same random seed, the speed of both training processes deteriorates after several iterations.
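For context, a minimal sketch of the kind of setup being described: each replicate is pinned to one GPU via CUDA_VISIBLE_DEVICES and seeded identically, so both processes draw the same random sequences. The function and constants here are illustrative, not from the original post; with PyTorch you would additionally call `torch.manual_seed(seed)` and `torch.cuda.manual_seed_all(seed)`.

```python
import os
import random

# Pin this replicate to a single GPU before any CUDA library is loaded.
# (Hypothetical: the device id would normally differ per replicate.)
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

def set_seed(seed: int) -> None:
    """Seed the stdlib RNG. In a real PyTorch script you would also seed
    numpy and torch so both replicates execute identical op sequences."""
    random.seed(seed)

# Both replicates use the same seed, so they proceed in lockstep.
set_seed(42)
first_draw = random.random()
```

Because both processes execute the same operations in the same order, any shared resource (such as the PCIe bus) tends to be requested at the same moments by both of them.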

This phenomenon appears when I use the ROI pooling layer implemented in longcw’s Faster R-CNN.

I guess it may be related to the implementation of the ROI pooling layer. When both processes are initialized with the same seed, some GPU operations may conflict with each other.

Does anyone know why?


So if you run a single process it’s OK, but if you start two they run slower? Maybe the GPUs are competing for PCIe bandwidth.

Thank you very much.

Maybe it is due to competition for PCIe bandwidth. But during the first tens of thousands of iterations, there was no such problem.

Do you know what I can do to avoid the problem?

The problem always comes with low GPU utilization.

Hey @yikang-li did you train longcw’s code on a custom dataset? If so what were the changes that you made?