I don’t know if someone has asked this before, but I really want to make sure everything I did is correct. Say we have learning rate lr, epoch count e, and batch size b as the baseline single-card setting. Now we move to DDP on 2 GPUs:
1). If we want the global batch size to stay the same as the single-card run, we keep lr unchanged and set the per-GPU batch size to b/2.
2). If we want to double the global batch size, given we have 2 GPUs, we keep the per-GPU batch size at b, but set lr = lr * 2 (the linear scaling rule) because of the larger effective batch size.
3). Should we modify the number of epochs?
4). I’m doing semi-supervised learning, which involves two losses (i.e., a supervised loss and a semi-supervised loss). During training there is a weight applied to the semi-supervised loss; we don’t need to change that weight, right?
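To make cases 1) and 2) concrete, here is a minimal sketch of the arithmetic I have in mind. The baseline numbers (lr = 0.1, b = 64) are just illustrative placeholders, not values from my actual setup:

```python
# Hypothetical single-card baseline settings (illustrative values).
base_lr = 0.1
base_batch = 64
world_size = 2  # number of GPUs under DDP

# Case 1): keep the global batch size equal to the single-card run.
# Each DDP process loads b / world_size samples; lr is left unchanged.
per_gpu_batch_same = base_batch // world_size  # 32 samples per GPU -> 64 global
lr_same = base_lr                              # unchanged

# Case 2): keep the per-GPU batch at b, so the global batch doubles.
# Linear scaling rule: scale lr by the ratio of global batch sizes.
per_gpu_batch_double = base_batch              # 64 samples per GPU
global_batch_double = per_gpu_batch_double * world_size  # 128 global
lr_double = base_lr * (global_batch_double / base_batch)  # 0.1 * 2 = 0.2

print(per_gpu_batch_same, lr_same)      # 32 0.1
print(global_batch_double, lr_double)   # 128 0.2
```

In PyTorch terms, the `batch_size` passed to each process's `DataLoader` is the per-GPU value, and `DistributedSampler` splits the dataset across the 2 processes, which is why the global batch is per-GPU batch × world size.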