Additional logs after updating PyTorch to 1.7 when using distributed training

Hi all, I found that logger.info prints additional lines when using distributed training (DDP).
With PyTorch 1.6, calling logger.info gives:

Epoch 1, Node 0, GPU 3, Iter 300, Top1 Accuracy:0.083602, Loss:5.4438, 132 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 6, Iter 300, Top1 Accuracy:0.086197, Loss:5.4218, 129 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 1, Iter 300, Top1 Accuracy:0.084302, Loss:5.4295, 86 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 2, Iter 300, Top1 Accuracy:0.083498, Loss:5.4405, 130 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 0, Iter 300, Top1 Accuracy:0.087469, Loss:5.4242, 131 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 5, Iter 300, Top1 Accuracy:0.083731, Loss:5.4297, 127 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 7, Iter 300, Top1 Accuracy:0.0856, Loss:5.4234, 130 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 4, Iter 300, Top1 Accuracy:0.08355, Loss:5.4322, 86 samples/s. lr: 0.64553.

But after updating PyTorch to 1.7:

Epoch 1, Node 0, GPU 3, Iter 300, Top1 Accuracy:0.083602, Loss:5.4438, 132 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 3, Iter 300, Top1 Accuracy:0.083602, Loss:5.4438, 132 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 6, Iter 300, Top1 Accuracy:0.086197, Loss:5.4218, 129 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 6, Iter 300, Top1 Accuracy:0.086197, Loss:5.4218, 129 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 1, Iter 300, Top1 Accuracy:0.084302, Loss:5.4295, 86 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 1, Iter 300, Top1 Accuracy:0.084302, Loss:5.4295, 86 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 2, Iter 300, Top1 Accuracy:0.083498, Loss:5.4405, 130 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 2, Iter 300, Top1 Accuracy:0.083498, Loss:5.4405, 130 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 0, Iter 300, Top1 Accuracy:0.087469, Loss:5.4242, 131 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 0, Iter 300, Top1 Accuracy:0.087469, Loss:5.4242, 131 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 5, Iter 300, Top1 Accuracy:0.083731, Loss:5.4297, 127 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 5, Iter 300, Top1 Accuracy:0.083731, Loss:5.4297, 127 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 7, Iter 300, Top1 Accuracy:0.0856, Loss:5.4234, 130 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 7, Iter 300, Top1 Accuracy:0.0856, Loss:5.4234, 130 samples/s. lr: 0.64553.
Epoch 1, Node 0, GPU 4, Iter 300, Top1 Accuracy:0.08355, Loss:5.4322, 86 samples/s. lr: 0.64553.
INFO:Distribute training logs.:Epoch 1, Node 0, GPU 4, Iter 300, Top1 Accuracy:0.08355, Loss:5.4322, 86 samples/s. lr: 0.64553.

I’m sure I only updated the PyTorch version and did not change any other package.
What prints the additional info, and how can I stop it?

@PistonY Do you mind sharing the part of the training script that prints these logs? I don’t see any code in PyTorch itself that would generate these extra log lines.

@osalpekar hi, thanks for the reply.
I use this script, and this line prints the info.

@PistonY Looking at the additional logs, isn’t it coming from here: https://github.com/PistonY/ModelZoo.pytorch/blob/fd85403bb1430ce591a97586a097c989630aa82b/scripts/utils.py#L41 instead of PyTorch?

hi, @pritamdamania87 thanks for the reply, but this line doesn’t print anything with PyTorch versions < 1.7.0. I didn’t modify any code, only updated the PyTorch version. Do you know how to stop it?

I think the PyTorch upgrade might be unrelated here, and something else is causing the additional logging. These log messages are generated by your application, not by PyTorch. Double-checking the logging configuration might reveal the reason.
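One way to do that double check (a stdlib-only sketch, not part of the original script) is to enumerate which loggers already have handlers attached right after your imports, to find out whether some package configured logging behind your back:

```python
import logging

def loggers_with_handlers():
    """Return {logger name: [handler class names]} for every logger
    that has at least one handler attached."""
    found = {}
    root = logging.getLogger()
    if root.handlers:
        found['root'] = [type(h).__name__ for h in root.handlers]
    for name, obj in logging.Logger.manager.loggerDict.items():
        # loggerDict also holds PlaceHolder objects; skip those.
        if isinstance(obj, logging.Logger) and obj.handlers:
            found[name] = [type(h).__name__ for h in obj.handlers]
    return found

# Example: after attaching a handler to a logger named 'demo',
# it shows up in the survey.
logging.getLogger('demo').addHandler(logging.StreamHandler())
print(loggers_with_handlers())
```

Running this before and after `import torch` (or any other upgraded package) should show which import added a handler.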

I don’t think the issue is on my side.
I configure logging as below.

import logging

def get_logger(file_path):
    # NOTE: file_path is currently unused; only a stream handler is attached.
    streamhandler = logging.StreamHandler()
    logger = logging.getLogger('Distribute training logs.')
    logger.setLevel(logging.INFO)
    logger.addHandler(streamhandler)
    return logger
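The `INFO:Distribute training logs.:...` prefix on the duplicate lines matches the default format used when something calls `logging.basicConfig()`, which attaches a handler to the root logger. If an imported package does that (whether or not it is PyTorch 1.7 itself is an assumption here), every record from a named logger also propagates up to the root handler and gets printed a second time. Setting `logger.propagate = False` keeps your own handler while suppressing the duplicate. A self-contained sketch (no torch required):

```python
import logging

class ListHandler(logging.Handler):
    """Collects messages in a list so emissions can be counted."""
    def __init__(self):
        super().__init__()
        self.records = []
    def emit(self, record):
        self.records.append(record.getMessage())

# Simulate a library attaching a handler to the root logger
# behind your back (what basicConfig() effectively does).
root_handler = ListHandler()
logging.getLogger().addHandler(root_handler)

# Application logger configured like get_logger() above.
app_handler = ListHandler()
logger = logging.getLogger('Distribute training logs.')
logger.setLevel(logging.INFO)
logger.addHandler(app_handler)

logger.info('Epoch 1 ...')
print(len(app_handler.records), len(root_handler.records))  # 1 1 -> duplicated

logger.propagate = False  # stop records bubbling up to the root logger
logger.info('Epoch 2 ...')
print(len(app_handler.records), len(root_handler.records))  # 2 1 -> no duplicate
```

This also explains why removing the logger name makes the duplicates disappear: `logging.getLogger()` with no name returns the root logger itself, so there is no parent to propagate to.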

I tried this and still see additional logs.
Could you please give some suggestions, @osalpekar?

I removed the name passed to logging.getLogger and no additional logs are printed.
But now it prints this:
Reducer buckets have been rebuilt in this iteration.
I don’t know what happened, but the logs are finally clean.