@caesar025 thanks for posting!
There are some previous discussions about how to adjust the learning rate when scaling up the batch size. Did you try that already? See: Should we split batch_size according to ngpu_per_node when DistributedDataparallel - #19 by junb
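In case it helps, here is a minimal sketch of one common heuristic, the linear scaling rule, where the learning rate grows in proportion to the global batch size. This is not necessarily what the linked thread recommends; the function name `scaled_lr` and all of its parameters are hypothetical and only for illustration.

```python
def scaled_lr(base_lr: float, base_batch_size: int,
              per_gpu_batch_size: int, world_size: int) -> float:
    """Linear scaling heuristic (an assumption, not a guaranteed recipe).

    base_lr / base_batch_size: a learning rate that worked at a reference batch size.
    per_gpu_batch_size * world_size: the global batch size seen per optimizer step.
    """
    global_batch_size = per_gpu_batch_size * world_size
    return base_lr * global_batch_size / base_batch_size


# Example: lr 0.1 was tuned for batch size 256; now training on 8 GPUs with 64 samples each.
lr = scaled_lr(base_lr=0.1, base_batch_size=256, per_gpu_batch_size=64, world_size=8)
```

With large global batch sizes this kind of scaling is often paired with a warmup period, so it may be worth treating the scaled value as a starting point to tune from.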