How to gather data from all GPUs with DataParallel?

Hi,

In my forward pass, I am trying to gather the CNN features from all GPUs so that I can compute class prototypes from the whole current batch. With DistributedDataParallel I can do this easily by following the example in MoCo. I am not sure how to do the same with DataParallel, though. It would be great if someone could give me some pointers.
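
For reference, the DistributedDataParallel gather I mean looks roughly like the sketch below. It is modeled on the `concat_all_gather` helper in the MoCo code. The single-process fallback is my own addition for illustration, and it assumes every process passes a tensor of the same shape. No gradients flow through the gathered result.

```python
import torch
import torch.distributed as dist


@torch.no_grad()
def concat_all_gather(tensor: torch.Tensor) -> torch.Tensor:
    """Gather `tensor` from every DDP process and concatenate along dim 0.

    Assumes all processes pass tensors of identical shape. Because of
    torch.no_grad(), no gradient flows through the returned tensor.
    """
    # Fallback (an assumption for illustration): outside a distributed
    # run there is only one process, so there is nothing to gather.
    if not (dist.is_available() and dist.is_initialized()):
        return tensor

    # One receive buffer per process, filled in by all_gather.
    gathered = [torch.zeros_like(tensor) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, tensor)
    return torch.cat(gathered, dim=0)
```

What I am looking for is the equivalent of this inside a DataParallel forward pass, where the replicas all run in a single process.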

TIA.