I’m working on a dataset with images of different aspect ratios and different sizes.
The model is quite large and takes a long time to train so I’m using multiple GPUs.
I’m wondering what is the “correct” or at least “reasonable” way to do that.
Should I just resize the images so that each batch has a single size? Where should that be done: in my Dataset, or in collation?
Should my Dataset return a single image, or should it batch them itself and return tuple(images_batch, result_batch)?
Should I calculate the prediction loss, accuracy, and other statistics inside the forward pass and just return them? Or should that happen outside, after the per-GPU outputs have been collated again?
I’m looking for common practice.
Resizing your images is more or less “mandatory”, because the samples in a batch have to be stacked into a single tensor. Otherwise you would need to pad them. Which one you pick is up to you; several approaches appear in the literature.
In any case, DataParallel splits the batch tensor along dim 0 across the GPUs, so every sample in the batch must have the same size.
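As a rough illustration of the padding route, here is a minimal sketch of a collate function that zero-pads every image to the largest height and width in the batch. The name `pad_collate` and the (image, label) sample layout are assumptions made for this example, not something from the question.

```python
import torch
import torch.nn.functional as F


def pad_collate(batch):
    """Zero-pad each image to the largest H and W in the batch, then stack.

    `batch` is assumed to be a list of (image, label) pairs,
    where each image is a tensor shaped (C, H, W).
    """
    images, labels = zip(*batch)
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    padded = [
        # F.pad order for the last two dims: (left, right, top, bottom).
        # Padding goes on the right and bottom only.
        F.pad(img, (0, max_w - img.shape[2], 0, max_h - img.shape[1]))
        for img in images
    ]
    return torch.stack(padded), torch.tensor(labels)


# Two images with different sizes end up in one (N, C, H, W) tensor.
batch = [(torch.rand(3, 32, 48), 0), (torch.rand(3, 40, 24), 1)]
images, labels = pad_collate(batch)
```

It would be passed as `DataLoader(dataset, batch_size=..., collate_fn=pad_collate)`. Padding keeps the aspect ratio intact, but the network then sees the padded border, so whether this is preferable to resizing depends on the task.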
If you are using the PyTorch DataLoader, you can either preprocess your dataset beforehand or apply the resizing in the __getitem__ method.
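A minimal sketch of resizing inside `__getitem__` could look like the following. The class name `ResizedImageDataset`, the in-memory list of tensors (standing in for whatever image loading you actually do), and the 64x64 target size are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset


class ResizedImageDataset(Dataset):
    """Hypothetical dataset that resizes every image to one fixed size."""

    def __init__(self, images, labels, size=(64, 64)):
        self.images = images  # list of float tensors shaped (C, H, W)
        self.labels = labels
        self.size = size      # target (height, width), an arbitrary choice

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        # F.interpolate expects a batch dimension, so add it and remove it again.
        img = F.interpolate(
            img.unsqueeze(0), size=self.size, mode="bilinear", align_corners=False
        ).squeeze(0)
        return img, self.labels[idx]


dataset = ResizedImageDataset(
    [torch.rand(3, 32, 48), torch.rand(3, 100, 20)], [0, 1]
)
image, label = dataset[1]
```

Note that a plain resize like this distorts the aspect ratio. If that matters, a common alternative is to resize the shorter side and then crop or pad to the target size.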
Your Dataset returns whatever you code it to return. The PyTorch DataLoader handles multiprocessing and builds batches by stacking samples along dim 0. If you use it, your Dataset only has to return one sample at a time. If you don’t use it, batching is up to your own code.
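To make the batching behaviour concrete, here is a small sketch with a toy dataset (the class `ToyImages` and its sizes are made up for the example). The Dataset returns one sample, and the DataLoader's default collation stacks the samples along dim 0.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyImages(Dataset):
    """Hypothetical dataset: each sample is one (image, label) pair."""

    def __len__(self):
        return 10

    def __getitem__(self, idx):
        # One same-sized image per call; the DataLoader does the batching.
        return torch.full((3, 8, 8), float(idx)), idx


loader = DataLoader(ToyImages(), batch_size=4, shuffle=False)
first_images, first_labels = next(iter(loader))
print(first_images.shape)  # batch dimension is added in front: (4, 3, 8, 8)
print(first_labels)        # integer labels are collated into a tensor
```

This only works with the default collation because every sample has the same shape; with mixed sizes the stacking step raises an error.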
You can compute the loss and accuracy wherever you want. It’s easier to code them outside forward, since they do not strictly belong to the network.
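A minimal sketch of computing the loss and accuracy outside `forward` might look like this. The tiny model, the random inputs, and the five-class setup are placeholders chosen for the example; the `nn.DataParallel` wrapper is only applied when more than one GPU is visible.

```python
import torch
import torch.nn as nn

# Placeholder network: forward returns raw logits and nothing else.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
if torch.cuda.device_count() > 1:
    # DataParallel splits the batch over the GPUs and gathers the outputs.
    model = nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()

device = next(model.parameters()).device
images = torch.rand(4, 3, 8, 8).to(device)
targets = torch.tensor([0, 1, 2, 3]).to(device)

logits = model(images)                # gathered output, shape (4, 5)
loss = criterion(logits, targets)     # loss computed outside forward
accuracy = (logits.argmax(dim=1) == targets).float().mean()
loss.backward()
```

Keeping the loss outside the model means the same network can be reused for inference unchanged. The trade-off is that with DataParallel the loss is computed on the gathering device only, which is usually a minor cost.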