DataLoader size limit in torch.load()?

I’m attempting to load a 14 GB object with torch.load() on Windows 10 (64-bit), Python 3.8.5, an RTX 3090, and PyTorch 1.7.1, and I’m getting the following error:

Exception has occurred: TypeError
get_storage_from_record(): incompatible function arguments. The following argument types are supported:
    1. (self: torch._C.PyTorchFileReader, arg0: str, arg1: int, arg2: object) -> at::Tensor

Invoked with: <torch._C.PyTorchFileReader object at 0x00000248E74CC970>, 'data/2588766380832', -785689216, torch.float32
  File "U:\endgame\datasetLoader.py", line 40, in loadOrCreateDatasets
    trainLoader = torch.load(savedFnameTrain)

The object was saved using torch.save(), and a smaller 3.5 GB object saved the same way loads fine. Is there a size limit at play? Is there a recommended workaround?
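
For reference, the save/load pattern looks roughly like this (simplified sketch with toy tensors standing in for the real ~14 GB data; the variable names match my code, everything else is illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the real dataset; the actual tensors are far larger.
demoSetTrain = TensorDataset(torch.randn(1000, 128),
                             torch.randint(0, 10, (1000,)))
trainLoader = DataLoader(demoSetTrain, batch_size=64, shuffle=True)

savedFnameTrain = "trainLoader.pt"
torch.save(trainLoader, savedFnameTrain)   # works for smaller objects
trainLoader = torch.load(savedFnameTrain)  # fails with the error above at ~14 GB
```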

I’ve also attempted to load the Dataset itself (18 GB, saved with pickle) and convert it into a DataLoader, but this fails as well:

Exception has occurred: RuntimeError
[enforce fail at ..\c10\core\CPUAllocator.cpp:48] ((ptrdiff_t)nbytes) >= 0. alloc_cpu() seems to have been called with negative number: 18446744070566794752
  File "U:\endgame\datasetLoader.py", line 42, in loadOrCreateDatasets
    demoSetTrain = pickle.load(open(savedFnameTrainDS, "rb"))

A similar report from March 2020: Using torch.save and torch.load on very heavy files

For large files, a typical workaround is to use the HDF5 format.
Check whether it suits your needs.
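
Something along these lines (a rough sketch assuming the data can be exported as plain arrays; h5py and the dataset names here are just examples, not part of your code):

```python
import h5py
import torch

# Toy data standing in for the real tensors.
features = torch.randn(1000, 128)
labels = torch.randint(0, 10, (1000,))

# Write each array as a separate HDF5 dataset.
with h5py.File("train_data.h5", "w") as f:
    f.create_dataset("features", data=features.numpy(), compression="gzip")
    f.create_dataset("labels", data=labels.numpy(), compression="gzip")

# Read back; HDF5 has no practical problem with files larger than a few GB,
# and you can also slice datasets lazily instead of loading everything at once.
with h5py.File("train_data.h5", "r") as f:
    features = torch.from_numpy(f["features"][:])
    labels = torch.from_numpy(f["labels"][:])
```

You could then rebuild the Dataset/DataLoader from these tensors at load time instead of pickling the loader object itself.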


The issue might also be related to this one.
Could you try the suggestion from that thread and use pickle protocol 4?
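
Roughly like this, reusing the names from your snippets (I haven’t verified that this resolves the 14 GB case):

```python
import pickle
import torch

# Protocol 4 (Python 3.4+) supports pickled objects larger than 4 GB;
# torch.save has historically defaulted to a lower protocol.
torch.save(trainLoader, savedFnameTrain, pickle_protocol=4)

# The same idea when pickling the Dataset directly:
with open(savedFnameTrainDS, "wb") as f:
    pickle.dump(demoSetTrain, f, protocol=4)
```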
