I am trying to follow the data augmentation practice in the original ResNet paper, Deep Residual Learning for Image Recognition, which includes:
The image is resized with its shorter side randomly sampled in [256, 480] for scale augmentation [41]. A 224×224 crop is randomly sampled from an image or its horizontal flip, with the per-pixel mean subtracted [21]. The standard color augmentation in [21] is used.
Here, the resize target is explicitly defined to fall in the range [256, 480], whereas in the PyTorch implementation of RandomResizedCrop we can only control the scale and aspect-ratio ranges of the crop relative to the source image; the output is always resized to a fixed size, no matter what the intermediate dimensions are. While it seems reasonable to do so to keep the output resolution consistent, I wonder:
Are there ways to make RandomResizedCrop behave in the "explicit resizing" way? (If not, I'll consider implementing my own.)
What is the essential reason for ditching the "explicit resizing" of the original paper? Do papers nowadays all prefer the "ratio resizing" version?