RuntimeError: unable to mmap 23104 bytes from file: Cannot allocate memory

I’m trying to load all of the dataset’s image files into memory, but I’m running into the following error:

Exception in thread Thread-3 (_handle_results):
Traceback (most recent call last):
  File "/home/mehran/.conda/envs/gtn_env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/mehran/.conda/envs/gtn_env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mehran/.conda/envs/gtn_env/lib/python3.10/multiprocessing/pool.py", line 579, in _handle_results
    task = get()
  File "/home/mehran/.conda/envs/gtn_env/lib/python3.10/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/home/mehran/.conda/envs/gtn_env/lib/python3.10/site-packages/torch/multiprocessing/reductions.py", line 514, in rebuild_storage_filename
    storage = torch.UntypedStorage._new_shared_filename_cpu(manager, handle, size)
RuntimeError: unable to mmap 23104 bytes from file </torch_6855_1827127219_18231>: Cannot allocate memory (12)

I understand that this error is complaining about a lack of memory, but I’m monitoring my memory usage and it’s barely half used. The full code is too long to post here, but here are some snippets that might help you help me:

import os
from functools import partial

import torch
import torch.multiprocessing as mp  # assumption: mp could equally be the standard multiprocessing module
from PIL import Image
from torchvision.transforms import v2
import torchvision.transforms.functional as TF

# Share tensors between processes via files instead of file descriptors.
torch.multiprocessing.set_sharing_strategy('file_system')

def load_image(file_path, data_path):
    transform = v2.Compose([
        lambda x: x.convert('L'),      # convert to single-channel grayscale
        lambda x: TF.to_tensor(x),     # PIL image -> float tensor in [0, 1]
        lambda x: x * 255.0,           # rescale back to [0, 255]
        lambda x: x.type(torch.int8),  # store as 8-bit integers to save memory
    ])
    sample = Image.open(os.path.join(data_path, file_path))
    image = transform(sample)
    return image

with mp.Pool(processes=16) as pool:
    samples = pool.map(partial(load_image, data_path=data_path),
                       filenames)

print("Dataset loaded")

This error happens when I cast the tensors to int8, which I do because otherwise the dataset won’t fit into my memory. In that scenario, I can see memory usage reaching my physical memory limit, and that’s totally acceptable. But in this case, as I said, usage barely reaches half my machine’s capacity before it errors out.
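To see why the cast matters for the total footprint, here is a rough back-of-the-envelope sketch. The image size and image count below are made-up numbers for illustration only (incidentally, a 152 × 152 single-channel int8 image would be exactly the 23104 bytes mentioned in the error):

# Hypothetical sizes: float32 costs 4 bytes per pixel, int8 costs 1 byte,
# so casting cuts the in-memory footprint roughly 4x.
height, width = 152, 152       # assumed image size, not from the original post
num_images = 1_500_000         # assumed dataset size, not from the original post

float32_bytes = height * width * 4 * num_images
int8_bytes = height * width * 1 * num_images

print(f"float32: {float32_bytes / 2**30:.1f} GiB")  # ~129 GiB
print(f"int8:    {int8_bytes / 2**30:.1f} GiB")     # ~32 GiB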

The funny thing is that after this error is raised, the code does not exit. It keeps running, but CPU usage drops (only one core stays at 100% while previously all cores were engaged) and the script never reaches the line after the pool.map call.

I tried splitting the list into smaller chunks and processing the chunks one at a time:

L = 100000
sub_filenames = [filenames[i:i+L] for i in range(0, len(filenames), L)]
samples = []
for chunk in sub_filenames:
    with mp.Pool(processes=16) as pool:
        samples.extend(pool.map(partial(load_image, data_path=data_path),
                                chunk))

The same error still occurs, this time much sooner and with less total memory used. It seems to me this has nothing to do with the amount of memory consumed.

I even lowered the number of processes to 4, so most of the CPU cores were engaged but not at 100% all the time (maybe 95%). And yet, the same error occurs.

For anyone else who might be facing a similar problem, this is the solution: leave the conversion of your images to tensors to the very last step of your transformations.

def load_image(file_path, data_path):
    # Return the PIL image itself; do NOT convert it to a tensor here.
    sample = Image.open(os.path.join(data_path, file_path))
    sample = sample.convert('L')
    return sample

samples = []
with mp.Pool(processes=16) as pool:
    samples.extend(pool.map(partial(load_image, data_path=data_path),
                            filenames))

Then move all your transformations to the point where the sample is returned from your dataset object:

transformation = v2.Compose([
    v2.RandomResizedCrop(size=(64, 64), antialias=True),
    v2.GaussianBlur(3),
    transforms.ToTensor(),                            # PIL image -> tensor, only at this point
    transforms.Normalize(mean=[0.912], std=[0.168]),
])

Up until transforms.ToTensor(), the object is still a PIL image; it is only converted to a tensor once it passes through transforms.ToTensor(). Apparently, passing tensors back from the worker processes is the part that the multiprocessing package does not like.
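As a concrete illustration, here is a minimal sketch of what applying the transformation inside the dataset can look like. The ImageListDataset class, its attribute names, and the usage lines are my own example, not the original code:

from torch.utils.data import Dataset

class ImageListDataset(Dataset):
    """Hypothetical dataset: holds the PIL images returned by the pool
    and applies the transformation only when an item is requested."""

    def __init__(self, samples, transform):
        self.samples = samples      # list of PIL images from pool.map
        self.transform = transform  # e.g. the v2.Compose defined above

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # The sample stays a PIL image until here; ToTensor() inside the
        # transform converts it to a tensor only in the consuming process.
        return self.transform(self.samples[idx])

# Usage sketch:
# dataset = ImageListDataset(samples, transformation)
# image = dataset[0]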