Feeding images of different sizes to a vision transformer

I have ~250 (non-natural) images on which I do some preprocessing. The originals are in the gigapixel range, so I reduce their size first, and I end up with feature tensors of varying sizes.

For example, one of my features.pt files has shape torch.Size([2231, 512]) and another has torch.Size([399, 512]).
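To give a clearer picture, this is roughly how I load them at the moment; the dataset class, paths, and labels below are just illustrative placeholders, not my exact code:

```python
import torch
from torch.utils.data import Dataset

class FeatureDataset(Dataset):
    """Each item is a [num_patches, 512] feature tensor; num_patches differs per image."""
    def __init__(self, feature_paths, labels):
        self.feature_paths = feature_paths
        self.labels = labels

    def __len__(self):
        return len(self.feature_paths)

    def __getitem__(self, idx):
        # e.g. torch.Size([2231, 512]) for one image, torch.Size([399, 512]) for another
        features = torch.load(self.feature_paths[idx])
        return features, self.labels[idx]
```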

So, when I feed these features through the training dataloader, do I need to normalize them to a specific size, and if so, what size should that be? Or is it fine to leave them as they are? The second dimension of torch.Size is always 512.
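One option I was considering is padding each batch to the length of its longest sequence with a custom collate_fn, roughly like the sketch below (the function name and the masking details are just my guess at how this could be done), but I'm not sure whether this is the right approach:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def collate_pad(batch):
    """Pad every [num_patches, 512] tensor in the batch up to the longest one."""
    features, labels = zip(*batch)
    lengths = torch.tensor([f.shape[0] for f in features])
    # pad_sequence stacks along a new batch dim and zero-pads the first dim
    padded = pad_sequence(features, batch_first=True)  # [B, max_patches, 512]
    # mask marks the real (non-padded) positions, e.g. for attention masking
    mask = torch.arange(padded.shape[1])[None, :] < lengths[:, None]
    return padded, mask, torch.tensor(labels)
```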

Here’s a related post, but I thought I’d create a separate question.