Correct way to store images in PyTorch

JBlanchon · October 14, 2021, 7:36am

Hi, I’m writing a EuroSAT torchvision VisionDataset.
My __getitem__() method returns Tuple[Any, Any]. My images are already stored as torch.Tensor before the transforms are applied (for RGB images I use torchvision.io.read_image, which directly returns a Tensor, and for full-channel images I use GDAL). But I see many torchvision datasets that store images as NumPy arrays or even PIL Images. This seems to be very common, since many people apply T.ToTensor() to the dataset.

So what is the best practice in PyTorch: using torch.Tensor directly and avoiding a ToTensor transform, or using NumPy arrays?
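
For concreteness, here is a minimal sketch of the tensor-first variant being asked about. It is not the actual EuroSAT code: the class name TensorImageDataset, the random stand-in data, and the use of a plain torch.utils.data.Dataset instead of VisionDataset are all assumptions made for illustration. The images are assumed to be uint8 tensors in (C, H, W) layout, which is what torchvision.io.read_image returns.

```python
from typing import Callable, Optional, Tuple

import torch
from torch.utils.data import Dataset


class TensorImageDataset(Dataset):
    """Hypothetical dataset whose images are already uint8 tensors (N, C, H, W)."""

    def __init__(
        self,
        images: torch.Tensor,
        labels: torch.Tensor,
        transform: Optional[Callable] = None,
    ) -> None:
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self) -> int:
        return self.images.shape[0]

    def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)
        return image, int(self.labels[index])


# Stand-in data: 8 random RGB images of 64x64 pixels, 10 classes.
images = torch.randint(0, 256, (8, 3, 64, 64), dtype=torch.uint8)
labels = torch.randint(0, 10, (8,))

# No ToTensor needed: the transform only rescales uint8 values to floats in [0, 1].
dataset = TensorImageDataset(images, labels, transform=lambda x: x.float() / 255.0)
image, label = dataset[0]
```

With this layout the only conversion left in the transform is the dtype change and rescaling, which ToTensor would otherwise perform on PIL or NumPy input.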

akvilonBrown · October 14, 2021, 2:25pm

Hi, I’m not knowledgeable enough about the performance/size aspects. Still, if you share your dataset with someone or reuse it later with other frameworks, it would be inconvenient to have to install PyTorch just to read it.

OrielBanne · October 14, 2021, 2:29pm

It depends on what you are trying to do. If you are working on the GPU, or in Torch in general, with no need to plot the image, keep it as a torch tensor. But when you want to actually look at the image, a tensor is not the sort of image you can plot directly. PIL images can be plotted, as can various other formats.

tom · October 14, 2021, 3:22pm

If you have tensors and expect to read them in one go, I think it is very reasonable to just store them as tensors. It is a speed thing, too. Going from "read from disk + CPU augmentation" to "store in memory + GPU augmentation" can give a formidable speedup.
(I did this as live coding in a workshop last week - in the grapevine notebooks.)
You won’t be able to read .pt files without PyTorch. To my mind, this is not too bad. They do have the advantage that they promise libtorch compatibility when you don’t have Python.
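
A minimal sketch of what storing and reloading a batch of image tensors as a .pt file can look like. The file name images.pt and the random stand-in data are hypothetical, and a temporary directory is used only so the snippet cleans up after itself.

```python
import os
import tempfile

import torch

# Stand-in for a preprocessed dataset: 4 RGB images of 64x64 pixels, kept as uint8.
images = torch.randint(0, 256, (4, 3, 64, 64), dtype=torch.uint8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "images.pt")  # hypothetical file name
    torch.save(images, path)    # serialize the whole tensor in one file
    restored = torch.load(path)  # read it back in one go
```

The round trip preserves dtype and shape, so no ToTensor-style conversion is needed after loading.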
You can always use torchvision’s ToPILImage transform if you want to show images. Typically, you have orders of magnitude more images going straight to the GPU than are being output.
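
As a rough illustration of getting a tensor into a displayable form, the sketch below uses only torch and converts a (C, H, W) tensor to the (H, W, C) NumPy layout that plotting libraries such as matplotlib expect. This is an alternative to the torchvision transform mentioned above, not a replacement for it, and the random image is a stand-in.

```python
import torch

# Stand-in image: float tensor in [0, 1] with channels first (C, H, W).
image = torch.rand(3, 64, 64)

# Move channels last and convert to NumPy; call .cpu() first if the tensor is on the GPU.
hwc = image.permute(1, 2, 0).numpy()

# hwc can now be passed to e.g. matplotlib's plt.imshow(hwc).
```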
There is a small caveat: depending on the hardware characteristics, I have seen that loading smaller files and doing some decoding (preferably on the GPU) can be faster than loading larger files that have been preprocessed. (I saw this with CT scans and int16 vs. fp32 storage, where the storage interface was the main limitation, so reading half as many bytes from disk was preferable.)
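
To make the size argument concrete, here is a small sketch comparing the in-memory footprint of the same data in int16 and fp32, and doing the conversion after the transfer to the device. The volume shape and value range are made up for illustration; actual on-disk sizes also depend on the file format.

```python
import torch

# Stand-in volume stored compactly as int16 (shape and value range are hypothetical).
scan_int16 = torch.randint(-1000, 1000, (16, 128, 128), dtype=torch.int16)
scan_fp32 = scan_int16.to(torch.float32)

# Raw payload size in bytes: element size times number of elements.
bytes_int16 = scan_int16.element_size() * scan_int16.nelement()
bytes_fp32 = scan_fp32.element_size() * scan_fp32.nelement()
print(bytes_int16, bytes_fp32)  # fp32 takes twice as many bytes as int16

# Transfer the compact tensor first, then convert on the device (GPU if available).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
decoded = scan_int16.to(device).to(torch.float32)
```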

Best regards

Thomas

JBlanchon · October 14, 2021, 4:41pm

Thanks Tom,

So I suppose that even if most people and most PyTorch datasets return NumPy ndarrays, it’s always better practice to use torch tensors directly.

I looked at your workshop; it is very well written. Thanks a lot.