DataParallel, data may have different sizes

chengyangfu · March 20, 2017, 8:32pm

Hi,
This is a question or recommendation about the inputs of DataParallel.

Currently, DataParallel requires all inputs to have the same size. Is it possible to support data with different sizes?

I am using PyTorch to implement detection and segmentation frameworks (computer vision), where the number of objects differs from image to image. My current solution is to add dummy entries so that the annotations of all images in a batch have the same dimensions. I think this works for bounding boxes, but it is not a good solution for segmentation masks.
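
The padding workaround described above can be sketched roughly as follows. This is plain Python rather than tensor code, and the `(x1, y1, x2, y2)` box layout, the `pad_boxes` name, and the `-1.0` dummy value are assumptions made for illustration, not anything from PyTorch itself.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2). Any fixed-length tuple would work.
Box = Tuple[float, float, float, float]


def pad_boxes(
    batch: Sequence[Sequence[Box]],
    pad_box: Box = (-1.0, -1.0, -1.0, -1.0),
) -> Tuple[List[List[Box]], List[int]]:
    """Pad every image's box list to the longest list in the batch.

    Returns the padded lists together with the real box count per image,
    so that later code (for example the loss) can ignore the dummy entries.
    """
    counts = [len(boxes) for boxes in batch]
    max_count = max(counts, default=0)
    padded = [
        list(boxes) + [pad_box] * (max_count - len(boxes))
        for boxes in batch
    ]
    return padded, counts


# Three images with 1, 2 and 0 annotated objects.
padded, counts = pad_boxes([
    [(0.0, 0.0, 1.0, 1.0)],
    [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)],
    [],
])
```

After padding, every image has the same number of rows, so the batch can be stacked into one tensor. The same trick is much more wasteful for segmentation masks, since each dummy entry would be a full-size mask.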

I would like to hear about a better solution to this kind of problem.

apaszke · March 21, 2017, 2:17pm

I don’t think there’s any general way in which we could add support for batches of inputs of different sizes. You can always subclass DataParallel and override the scatter method, so that it splits the data differently.
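
A minimal sketch of the splitting logic such an override might use, assuming the batch is kept as a Python list of per-image samples of arbitrary size instead of one stacked tensor. The helper `split_for_devices` is hypothetical. Wiring it into a `DataParallel` subclass and moving each chunk to its GPU are left out here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_for_devices(samples: Sequence[T], num_devices: int) -> List[List[T]]:
    """Split samples into at most num_devices contiguous, near-equal chunks.

    Each sample may have its own size (e.g. a different number of boxes or
    masks); only the number of samples per chunk is balanced.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    base, extra = divmod(len(samples), num_devices)
    chunks: List[List[T]] = []
    start = 0
    for i in range(num_devices):
        # The first `extra` chunks take one additional sample.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer samples than devices: leave the rest unused
        chunks.append(list(samples[start:start + size]))
        start += size
    return chunks


# Five images with different numbers of annotations, split over two devices.
annotations = [["a"], ["b", "c"], [], ["d", "e", "f"], ["g"]]
per_device = split_for_devices(annotations, 2)
```

Each chunk can then be handed to one replica, which processes its samples without needing them to share a shape with the samples on other devices.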