How to scale/warmup the learning rate for large batch size?

wanchaol · March 22, 2022, 6:17am

@caesar025 thanks for posting!

There’s some previous discussions about how to adjust learning rate when scaling up batch size, did you try it already? Should we split batch_size according to ngpu_per_node when DistributedDataparallel - #19 by junb