Should the learning rate change according to the number of GPUs?

Ardeal · August 21, 2021, 12:32pm

Hi,

On 1 GPU, using the parameters batch_size=32 and lr=0.0001, I get good accuracy.
I would like to use 8 GPUs to retrain the model.
My question is:
should I change lr to 0.0001*8 when I use 8 GPUs?

James_W · August 22, 2021, 7:21pm

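Yes, of course the LR should change. Follow the linear scaling rule from this paper (when the total minibatch size is multiplied by k, multiply the learning rate by k): https://arxiv.org/pdf/1706.02677.pdf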
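
To make the arithmetic concrete, here is a minimal sketch of that rule. It assumes batch_size=32 is the per-GPU batch size, so 8 GPUs give a total batch of 256; if 32 is instead split across the 8 GPUs, the total batch does not change and neither should the LR. The function names `scaled_lr` and `warmup_lr` are made up for illustration and are not part of any library. The warmup helper is included because the paper pairs the rule with a gradual warmup; the 500-step warmup length below is an arbitrary placeholder. The paper studied SGD with momentum, so treat the scaled value as a starting point to validate rather than a guarantee.

```python
def scaled_lr(base_lr, base_batch_size, per_gpu_batch_size, num_gpus):
    """Scale the LR in proportion to the total (global) batch size."""
    global_batch_size = per_gpu_batch_size * num_gpus
    return base_lr * global_batch_size / base_batch_size


def warmup_lr(step, warmup_steps, start_lr, target_lr):
    """Ramp linearly from start_lr to target_lr, then hold target_lr."""
    if step >= warmup_steps:
        return target_lr
    return start_lr + (target_lr - start_lr) * step / warmup_steps


# Values from the question: lr=0.0001 tuned at batch_size=32 on 1 GPU.
base_lr = 1e-4
target_lr = scaled_lr(base_lr, base_batch_size=32, per_gpu_batch_size=32, num_gpus=8)
print(f"scaled lr for 8 GPUs: {target_lr:.6f}")

# Hypothetical warmup length; tune it for your own run.
for step in (0, 250, 500):
    lr = warmup_lr(step, warmup_steps=500, start_lr=base_lr, target_lr=target_lr)
    print(f"step {step}: lr = {lr:.6f}")
```

In a training loop, the value returned by `warmup_lr` would be written to the optimizer's learning rate at each step before the update.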