I’m getting the following error:
RuntimeError: [enforce fail at CPUAllocator.cpp:65] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2454416848 bytes. Error code 12 (Cannot allocate memory)
Here is the relevant part of the code that triggered it:
import time
import torch
from multiprocessing import Pool

pool = Pool(70)
start = time.time()
print("length of lines: ", len(lines))
line_tensors = pool.map(process, lines)
print("lines processed")
print(time.time() - start)

# drop lines that process() couldn't handle
line_tensors = [x for x in line_tensors if x is not None]

size = 0
for l in line_tensors:
    size += l.shape[0]
print("total size: ", size)

megatensor = torch.cat(line_tensors, dim=0).flatten()
The total size printed is 306802106, and the error occurs on the last line:
Traceback (most recent call last):
  File "bert.py", line 73, in <module>
    megatensor = torch.cat(line_tensors, dim=0).flatten()
RuntimeError: [enforce fail at CPUAllocator.cpp:65] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2454416848 bytes. Error code 12 (Cannot allocate memory)
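For what it's worth, the requested allocation seems to line up exactly with the printed element count at 8 bytes per element (which, if I understand correctly, is what integer tensors default to, `torch.int64`). A quick sanity check on the numbers from the output above:

```python
total_elements = 306802106    # the "total size" printed by the script
requested_bytes = 2454416848  # the size from the RuntimeError

# the failed allocation is exactly one 8-byte element per counted row,
# i.e. torch.cat is asking for the full int64 result in one contiguous block
assert requested_bytes == total_elements * 8

print(requested_bytes / 1024**3)  # roughly 2.29 GiB
```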
I assumed this is an out-of-memory error, but I'm fairly sure my machine has 2.5 GB of free memory. This similar (?) piece of code also seems to run fine:
import torch

x = []
for i in range(100):
    x.append(torch.randint(1, 10000, (100000000,)))
    print(i)

mega = torch.cat(x, dim=0).flatten()
Any thoughts on what might be wrong here?
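One thing I'm wondering about: my understanding is that torch.cat copies all of its inputs into a newly allocated contiguous tensor, so while it runs the process holds both the input list and the full-size output. A small-scale illustration of that assumption (toy sizes, not my real data):

```python
import torch

# ten small int64 chunks, 1000 elements each
chunks = [torch.randint(1, 10000, (1000,)) for _ in range(10)]
input_bytes = sum(c.numel() * c.element_size() for c in chunks)

# torch.cat allocates a fresh contiguous buffer for the result,
# so at this point both the chunks and the copy exist in memory
merged = torch.cat(chunks, dim=0).flatten()
output_bytes = merged.numel() * merged.element_size()

print(input_bytes, output_bytes)  # both 80000: inputs still alive alongside the copy
```

If that's right, the peak requirement would be roughly double the 2.29 GiB result while the concatenation is in flight, which might explain the failure even with 2.5 GB free.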