PyTorch: efficient data loading and model training techniques

Hello! I am working with a dataset of around 100K images. The images all have different rectangular shapes. I tried transforms.Resize(), but it distorts most of them, so I settled for a training loader with a batch size of 1 and gradient accumulation, keeping the images at their original size. Even with num_workers: 0 and pin_memory: True, the speed-up after the first epoch is almost negligible. I assume this is because of the high resolution of the images, since the same settings ran faster on smaller images. Are there any other approaches I can use to speed up training in PyTorch? A single epoch currently takes more than 45 minutes.
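
For reference, here is a minimal sketch of the kind of setup described above: a DataLoader with batch size 1 over images of differing sizes, with gradient accumulation in the training loop. It is an illustration under assumptions, not the actual training code. The `VariableSizeImages` dataset (random tensors standing in for real images), the tiny model, and the `accum_steps` value are all hypothetical placeholders.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class VariableSizeImages(Dataset):
    """Hypothetical stand-in dataset: random RGB tensors of differing sizes."""

    def __init__(self, sizes):
        self.sizes = sizes  # list of (height, width) pairs

    def __len__(self):
        return len(self.sizes)

    def __getitem__(self, idx):
        h, w = self.sizes[idx]
        image = torch.rand(3, h, w)      # placeholder for a decoded image
        label = torch.tensor(idx % 2)    # placeholder binary label (int64)
        return image, label


def train_one_epoch(model, loader, optimizer, accum_steps, device):
    """Run one epoch with gradient accumulation; return optimizer steps taken."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    optimizer.zero_grad()
    optimizer_steps = 0
    for i, (image, label) in enumerate(loader):
        image, label = image.to(device), label.to(device)
        # Scale the loss so the accumulated gradient approximates a mean
        # over accum_steps samples.
        loss = criterion(model(image), label) / accum_steps
        loss.backward()
        # Step every accum_steps samples, and once more for a trailing
        # partial group at the end of the epoch.
        if (i + 1) % accum_steps == 0 or (i + 1) == len(loader):
            optimizer.step()
            optimizer.zero_grad()
            optimizer_steps += 1
    return optimizer_steps


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Adaptive pooling lets the placeholder model accept any input height/width.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
).to(device)

dataset = VariableSizeImages([(32, 48), (40, 24), (64, 64), (28, 56), (50, 30)])
loader = DataLoader(
    dataset,
    batch_size=1,                          # images differ in size, so no stacking
    shuffle=True,
    num_workers=0,                         # loading happens in the main process
    pin_memory=torch.cuda.is_available(),  # pinning only matters with a GPU
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

steps = train_one_epoch(model, loader, optimizer, accum_steps=2, device=device)
```

With five samples and `accum_steps=2`, the loop performs an optimizer step after the second and fourth samples and a final step for the leftover fifth sample.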