Yeah, the StackOverflow explanation is wrong. In general, it seems there aren’t many better places to get PyTorch advice than these forums.
The TorchVision documentation explains:
- If size is a sequence like (h, w), output size will be matched to this.
- If size is an int, smaller edge of the image will be matched to this number, i.e., if height > width, then image will be rescaled to (size * height / width, size)
The idea of the latter is to preserve the aspect ratio of the image; you then follow the resize with a cropping operation to get the final size.
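To make the two cases concrete, here is a small pure-Python sketch of the size rule quoted above. The helper `resized_shape` is a made-up name for illustration, not part of TorchVision, and truncating with `int()` is my assumption about the rounding, so the real transform may differ by a pixel.

```python
def resized_shape(height, width, size):
    """Return the (height, width) the documented rule would produce.

    Hypothetical helper for illustration only; not a TorchVision function.
    """
    if isinstance(size, int):
        # Int case: the smaller edge is matched to `size` and the other
        # edge is scaled by the same factor, so the aspect ratio is kept.
        if height > width:
            return (int(size * height / width), size)  # int() is an assumption
        return (size, int(size * width / height))
    # Sequence case: (h, w) is used as-is, so the aspect ratio may change.
    return tuple(size)


print(resized_shape(480, 640, 256))         # (256, 341)
print(resized_shape(640, 480, 256))         # (341, 256)
print(resized_shape(480, 640, (224, 224)))  # (224, 224)
```

In a real pipeline this is typically `transforms.Resize(256)` followed by something like `transforms.CenterCrop(224)`, so the crop, not the resize, brings the image to the final square size.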
Best regards
Thomas