Will the observer be synchronized across GPUs in DistributedDataParallel?

I’m wondering about the behavior of quantization observers, e.g. HistogramObserver, when the model is wrapped with DistributedDataParallel.

Will the estimated histograms (or min/max values) be synchronized across multiple GPUs?

The observer min/max values are stored as buffers in the module. So they get broadcast from the rank 0 machine when run with DDP. This ensures the values are synchronized across all machines.
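A quick way to see this is to check the registered buffers of an observer. Here is a minimal sketch using MinMaxObserver (HistogramObserver stores its state the same way); the tensor values are made up for illustration:

```python
import torch
from torch.ao.quantization import MinMaxObserver

# Observers record activation ranges in registered buffers.
obs = MinMaxObserver()
obs(torch.randn(100))  # feed a calibration batch

# min_val / max_val appear in named_buffers(), so DDP's
# broadcast_buffers=True (the default) sends rank 0's values
# to every other rank at the start of each forward pass.
buffers = dict(obs.named_buffers())
print("min_val" in buffers, "max_val" in buffers)
```

Because they are buffers rather than parameters, they are not touched by the optimizer or gradient all-reduce; only the periodic buffer broadcast keeps them consistent.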

Thanks for clarifying.

Does this mean only a part of the samples (the samples on GPU-0) are used to estimate the quantization parameters?

Yes, we use the samples on rank 0 to determine the quantization parameters.
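If you want every rank's samples to contribute, one workaround (a sketch, not a built-in feature) is to all-reduce the observer buffers across ranks after calibration, before the broadcast overwrites them. For min/max observers the combination rule is simple: take the MIN of per-rank mins and the MAX of per-rank maxes. The example below uses a single-process gloo group purely so it runs standalone; in real DDP training each rank would execute the same reduction:

```python
import os
import torch
import torch.distributed as dist
from torch.ao.quantization import MinMaxObserver

# Single-process group for illustration only; with N ranks the
# all_reduce combines statistics from all of them.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29511")
dist.init_process_group("gloo", rank=0, world_size=1)

obs = MinMaxObserver()
obs(torch.randn(100))  # per-rank calibration data

# Global range = (min of mins, max of maxes) across ranks.
dist.all_reduce(obs.min_val, op=dist.ReduceOp.MIN)
dist.all_reduce(obs.max_val, op=dist.ReduceOp.MAX)

dist.destroy_process_group()
```

Note this rule only works for range-based observers; histogram-based observers would need their histogram bins summed instead, which is more involved.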