Required input spatial size for the pretrained ResNet50 model from torchvision

I need to use the pretrained ResNet50 model from the torchvision library, but the required height and width of the input are not clear to me. The only mention is in one of its introductory sections: 'All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.' I find this ambiguous. Does it mean the input should be (224, 224), or is any spatial size of 224 or larger, whether square or rectangular, accepted?