Train a convnet with images of different sizes?

I would like to train a ResNet-18 network on GPU using images of different sizes, and I do not want to resize or pad them. The architecture allows this, as it contains an AdaptiveAvgPool2d layer.

The problem I'm stuck on is how to create a DataLoader that loads images of different sizes.
From a number of posts on this topic, it appears that a custom collate_fn should be used instead of the DataLoader's default one.

Can anybody point me in the right direction on how to create such a DataLoader for differently sized images that can be used for training?

Thank you very much :slight_smile:

All resnets start with a Conv2d layer, which expects a tensor of shape (batch_size, num_channels, H, W). You can pass images of arbitrary spatial size, yes, but all images within a single batch must share the same HxW, since they are stacked into one tensor.
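One common workaround, as a sketch: use a custom collate_fn that returns the images as a plain Python list instead of stacking them into one tensor, then run each image through the model with batch size 1. The dataset here is assumed to yield (CHW float tensor, int label) pairs; the function name `list_collate` is just an illustrative choice.

```python
import torch
from torch.utils.data import DataLoader

def list_collate(batch):
    # Keep the images as a list rather than calling torch.stack,
    # so tensors with different H and W can coexist in one "batch".
    images = [sample[0] for sample in batch]
    labels = torch.tensor([sample[1] for sample in batch])
    return images, labels

# Hypothetical usage in a training loop:
# loader = DataLoader(dataset, batch_size=4, collate_fn=list_collate)
# for images, labels in loader:
#     # Forward each image individually (batch size 1), then
#     # concatenate the outputs so one loss covers the whole batch.
#     outputs = torch.cat([model(img.unsqueeze(0)) for img in images])
#     loss = criterion(outputs, labels)
```

The trade-off is that per-image forward passes lose the throughput benefit of batched GPU kernels; bucketing images of similar sizes into the same batch is a common way to recover some of it.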