The contribution of random image scales and crops

When training a model on ImageNet, the usual data processing pipeline is:

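    from torchvision import datasets, transforms

    # traindir and normalize are defined elsewhere in the training script
    train_dataset = datasets.ImageFolder(
        traindir,
        transforms.Compose([
            transforms.RandomResizedCrop(224),   # random scale and aspect ratio, resized to 224x224
            transforms.RandomHorizontalFlip(),   # flips with probability 0.5
            transforms.ToTensor(),
            normalize,
        ]))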

It seems from the source that in each epoch each image would be cropped and resized differently.

Is this true?
Is this randomization critical for the success of training, compared to omitting it and showing the model the same image in every epoch?

Yes, the images are augmented on the fly: the random transformations are re-sampled every time an image is loaded, so each epoch sees a different crop and scale of the same image.
Data augmentation is generally used to create “new” images and thus artificially increase the size of your training dataset. If you skip the random part of your transformations, you no longer have data augmentation, just pre-processing.
You can read more about it in the regularization chapter of Goodfellow et al.'s Deep Learning book.
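
To make the "on the fly" part concrete, here is a minimal standard-library sketch of how a random resized crop box can be sampled on each call, compared with a fixed center crop. It is a simplified illustration and not torchvision's actual code: the function names `sample_crop_box` and `center_crop_box` are made up for this example, the default `scale` and `ratio` ranges are assumed to match the documented defaults of `RandomResizedCrop`, and the fallback branch is deliberately simplified.

```python
import math
import random


def sample_crop_box(width, height, rng, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)):
    """Sample (left, top, crop_w, crop_h) for one image load (illustrative only)."""
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(10):
        # Draw a target area fraction and an aspect ratio (log-uniform).
        target_area = area * rng.uniform(scale[0], scale[1])
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            # Place the box at a random position that stays inside the image.
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Fallback (simplified here): a central square crop.
    return center_crop_box(width, height)


def center_crop_box(width, height):
    """Deterministic alternative: the same central square on every call."""
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side, side


rng = random.Random(0)  # seeded only to make the example reproducible
# Five "epochs" over the same 500x375 image.
random_boxes = [sample_crop_box(500, 375, rng) for _ in range(5)]
fixed_boxes = [center_crop_box(500, 375) for _ in range(5)]

for epoch, (rand_box, fixed_box) in enumerate(zip(random_boxes, fixed_boxes)):
    print(f"epoch {epoch}: random crop {rand_box}, fixed crop {fixed_box}")
```

With the random version the model sees a different region of the image in each epoch, while the fixed version always yields the same pixels, which is pre-processing rather than augmentation. In torchvision, a deterministic pipeline of this kind is usually written as `transforms.Resize(256)` followed by `transforms.CenterCrop(224)`.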