Running the code below devours all available RAM:

```python
import numpy as np
from torchvision import transforms

transform = transforms.Compose([transforms.ToTensor(), transforms.Resize(256)])
X = np.full((1, 256, 256), 0)
Y = transform(X)
```
I was making custom masks for a segmentation problem and wanted constant tensors with various fill values. I don't actually need the resize, but I'd like to understand why it blows up. From the docs, I don't see a problem with turning an array into a tensor and then resizing it.
Images stored as numpy arrays are expected in the channels-last (H x W x C) memory layout. For a grayscale image you could therefore pass a [256, 256] or [256, 256, 1] array, which will return the expected output.
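To see how the layout is interpreted, here's a sketch that mimics the H x W x C → C x H x W transpose that ToTensor performs (pure numpy, so the shapes are illustrative rather than run through torchvision itself):

```python
import numpy as np

# channels-last (H, W, C) grayscale mask: what torchvision expects
x_ok = np.full((256, 256, 1), 0, dtype=np.uint8)
# mimic ToTensor's (H, W, C) -> (C, H, W) transpose
print(x_ok.transpose(2, 0, 1).shape)  # (1, 256, 256)

# a channels-first (1, 256, 256) array is instead read as H=1, W=256, C=256
x_bad = np.full((1, 256, 256), 0, dtype=np.uint8)
print(x_bad.transpose(2, 0, 1).shape)  # (256, 1, 256)
```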
In the current format, torchvision assumes that your image has a height of 1, a width of 256, and 256 channels. ToTensor() will therefore return a tensor of shape [256, 1, 256], which is then passed to Resize. Resize also assumes the wrong layout: it tries to scale the "small side" (the height of 1) up to 256 and scales the long side by the same factor, resulting in a tensor of shape [256, 256, 65536] and a memory footprint of:
channels * height * width * 4 bytes (float32)
= 256 * 256 * (256 * 256) * 4 = 17179869184 bytes ≈ 16 GB
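The arithmetic can be checked directly:

```python
# shape after the bogus resize: 256 channels, height 256, width 256 * 256
channels, height, width = 256, 256, 256 * 256
num_bytes = channels * height * width * 4  # 4 bytes per float32 element
print(num_bytes)          # 17179869184
print(num_bytes / 2**30)  # 16.0 (GiB)
```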
which matches the error message perfectly:

```
DefaultCPUAllocator: not enough memory: you tried to allocate 17179869184 bytes. Buy new RAM!
```
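As a fix sketch (assuming you keep the ToTensor pipeline), move the channel axis to the end before passing the array in, so torchvision sees a channels-last grayscale image:

```python
import numpy as np

X = np.full((1, 256, 256), 0, dtype=np.float32)
# (1, 256, 256) channels-first -> (256, 256, 1) channels-last
X_hwc = np.transpose(X, (1, 2, 0))
print(X_hwc.shape)  # (256, 256, 1)
# transform(X_hwc) would now yield a (1, 256, 256) tensor,
# and Resize(256) leaves the 256x256 spatial size unchanged
```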
Thanks, channels-first/channels-last issues have definitely fooled me more than once.