Image loading: Manual resize or using resize in Transform

phphuc · May 23, 2023, 2:56am

Hi everyone,
I am currently working with a dataset of over 200,000 images in 4K resolution. I wonder whether I should resize the images manually (make a smaller copy of the dataset) before training, or whether I could just use Resize from the transforms module.
Thank you in advance.
P.S. I’m rather new to PyTorch, so I would appreciate any explanations.

ptrblck · May 23, 2023, 4:31am

Both would work. Which one is better depends on your use case, in particular on whether you want to randomly crop the images before resizing them, since that kind of augmentation needs the original resolution at load time.
I would not expect a huge overhead from resizing the images on the fly, but this also depends on your system.

Kapil_Rana · May 23, 2023, 5:37am

Both are fine and will work. If you have a specific target size and enough disk space, I would suggest resizing before training. It will save a lot of time, since otherwise time is spent resizing every time an image is loaded.
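
A rough sketch of what that one-time preprocessing could look like with Pillow. The function name `resize_copy`, the 224x224 size, and the list of extensions are assumptions for illustration; the destination folder should be separate from the source folder.

```python
from pathlib import Path

from PIL import Image


def resize_copy(src_dir, dst_dir, size=(224, 224), exts=(".jpg", ".jpeg", ".png")):
    """Write a resized copy of every image under src_dir into dst_dir.

    The sub-folder layout is kept, so class-per-folder datasets stay usable.
    Returns the number of images written.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    count = 0
    for src in sorted(src_dir.rglob("*")):
        # Skip directories and anything that is not a known image type.
        if not src.is_file() or src.suffix.lower() not in exts:
            continue
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src) as img:
            # Pillow's default resampling filter is used here.
            img.convert("RGB").resize(size).save(dst)
        count += 1
    return count
```

Note that this stretches every image to a fixed size and re-encodes JPEGs at Pillow's default quality, so check a few outputs before converting the whole dataset.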

phphuc · May 24, 2023, 6:17am

In my case, the images are simply fed to a classification model without any further augmentation. My question is whether the time to load and resize the 4K images in every batch is significant relative to the training time.
(I’m sorry for my slow response.)
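
One way to get a number without a full experiment is to time the decode-and-resize step on a small sample of files. This is only a sketch: the helper name `mean_load_resize_seconds` and the 224x224 size are assumptions, and it measures Pillow alone, in a single process.

```python
import time

from PIL import Image


def mean_load_resize_seconds(paths, size=(224, 224)):
    """Average wall-clock seconds to decode and resize one image from paths."""
    start = time.perf_counter()
    for path in paths:
        with Image.open(path) as img:
            # convert() forces the full decode; resize() does the downscaling.
            img.convert("RGB").resize(size)
    return (time.perf_counter() - start) / len(paths)
```

Multiplying the result by the batch size and dividing by the number of data-loading workers gives a rough estimate of the loading time per batch, which can then be compared with the time of one training step.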

phphuc · May 24, 2023, 6:23am

Thank you for your response. I am on the side of the preprocessing idea, too. I don’t have enough time to experiment, so if you have some insights about the pros and cons of these two approaches, I hope to hear more from you.