I am training a rotation detection model and rotate the images on the fly during training (instead of as a preprocessing stage). This CPU-side rotation has become the bottleneck: disabling it reduces epoch time from 74s to 6s and drastically improves GPU utilization. I am looking for suggestions to improve my training performance.
My Dataset class looks something like this:
import numpy as np
import scipy.ndimage
from PIL import Image
from torch.utils.data import Dataset

class RotatingDataset(Dataset):
    ...
    def __getitem__(self, idx):
        # random rotation angle in degrees; also used as the label
        target = np.random.uniform(-180, 180)
        image = read_pil_image(idx)  # ~380x380 image
        # relevant bit
        image = scipy.ndimage.rotate(image, target, reshape=True, mode='nearest')
        image = Image.fromarray(image)  # back to PIL
        # apply transforms (mainly resize/crop/normalize)
        image = self.transforms(image)
        return image, target
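To put a number on the rotation step in isolation, a minimal timing sketch could look like the one below. It uses a random array as a hypothetical stand-in for one of my ~380x380 images, and the `time_rotation` helper is made up for illustration. The `order=1` run is only a comparison point (`order=3` is the scipy default that the Dataset above ends up using); I have not checked how it affects model accuracy.

```python
import time

import numpy as np
import scipy.ndimage

# Hypothetical stand-in for one ~380x380 RGB image from the dataset.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(380, 380, 3), dtype=np.uint8)


def time_rotation(img, order, repeats=5):
    """Return (mean seconds per rotation, last rotated array)."""
    start = time.perf_counter()
    for _ in range(repeats):
        angle = rng.uniform(-180, 180)
        rotated = scipy.ndimage.rotate(
            img, angle, reshape=True, mode="nearest", order=order
        )
    elapsed = time.perf_counter() - start
    return elapsed / repeats, rotated


# order=3 is scipy's default spline order, i.e. what the Dataset above uses.
default_time, default_out = time_rotation(image, order=3)
# order=1 (linear interpolation) is an assumption added for comparison only.
linear_time, linear_out = time_rotation(image, order=1)

print(f"order=3: {default_time * 1e3:.1f} ms per image")
print(f"order=1: {linear_time * 1e3:.1f} ms per image")
```

Multiplying the per-image time by the dataset size and dividing by the number of workers gives a rough estimate of how much of the 74s epoch the rotation alone accounts for.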
Some additional details:
- I’m using scipy.ndimage.rotate (instead of PIL or torchvision.transforms.functional.rotate) because I wanted fill-mode='nearest'.
- I’ve already implemented the DataLoader optimization “tricks” (num_workers, pin_memory).
- I have already resized images on disk to ~380 to reduce io.
- The main reason I want to rotate on the fly is so that every epoch a given image is rotated by a different angle, to mitigate overfitting.
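To illustrate why the fill mode matters to me, here is a small sketch on a toy 8x8 array (not my real data). It compares scipy's default `mode='constant'` with `mode='nearest'`; `order=1` is chosen here only to keep the toy values easy to read.

```python
import numpy as np
import scipy.ndimage

# Tiny constant "image" so that any fill values stand out.
image = np.ones((8, 8), dtype=np.float32)

# Default mode='constant' fills the new corners with cval (0.0).
constant_fill = scipy.ndimage.rotate(
    image, 45, reshape=True, order=1, mode="constant"
)
# mode='nearest' extends the closest edge pixel into the corners instead.
nearest_fill = scipy.ndimage.rotate(
    image, 45, reshape=True, order=1, mode="nearest"
)

print("corner with mode='constant':", constant_fill[0, 0])
print("corner with mode='nearest': ", nearest_fill[0, 0])
```

With the constant fill, the black corner triangles give away the rotation angle directly, which is what I want to avoid for a rotation detection model.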
Thanks for taking the time to read through this! Any suggestion is welcome.