Memory-mapped GPU tensor

Hi all,

I would like to have a memory-mapped tensor on the GPU that exceeds the GPU’s memory and is backed by a file on disk. I was partially successful in doing that on the CPU, like this:

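import torch

# Map a file on disk as a shared storage of 10^12 doubles (about 8 TB).
s = torch.DoubleStorage("a_path", shared=True, size=1000000000000)
t = torch.DoubleTensor(s)
# The in-place update is written to the file through the mapping.
t += 1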

The computation itself worked fine and the data was successfully synced to the file. However, at the end of the operation the program died with a bus error (this might just as well be a problem with my system; I am not sure).

Doing the same with a CUDA tensor fails with:

RuntimeError: not available yet for THCStorage at /pytorch/aten/src/THC/generic/THCStorage.cpp:84

My understanding of PyTorch’s C++ API is very limited (so far), so I would like to avoid digging deep into the C++ implementation. Is there a way to provide a (potentially slow) object to a torch.cuda.Tensor in which I could implement the memory swapping myself? If so, how? Would it need to implement the buffer interface, the iterator protocol, …?
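
To illustrate what I mean by doing the swapping myself, here is a rough sketch that processes the file chunk by chunk. It is only a workaround under stated assumptions, not a real memory-mapped CUDA tensor: it uses numpy.memmap instead of a torch storage, the function name add_one_in_chunks, the file name tensor.bin and the chunk size are made up for the example, and it falls back to the CPU when no GPU is available.

```python
import os
import tempfile

import numpy as np
import torch


def add_one_in_chunks(path, n_elements, chunk_elements, device):
    """Add 1 to every float64 element of a file, one chunk at a time.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    # np.memmap leaves the data on disk and pages it in on demand.
    data = np.memmap(path, dtype=np.float64, mode="r+", shape=(n_elements,))
    for start in range(0, n_elements, chunk_elements):
        stop = min(start + chunk_elements, n_elements)
        # Copy the chunk into ordinary host memory, then move it to the device.
        chunk = torch.from_numpy(np.array(data[start:stop])).to(device)
        chunk += 1
        # Copy the result back into the memory-mapped file.
        data[start:stop] = chunk.cpu().numpy()
    data.flush()
    del data  # drop the reference to the mapping


# Tiny demo on a temporary file (sizes chosen only for the example).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
path = os.path.join(tempfile.mkdtemp(), "tensor.bin")
np.zeros(10, dtype=np.float64).tofile(path)
add_one_in_chunks(path, n_elements=10, chunk_elements=4, device=device)
result = np.fromfile(path, dtype=np.float64)
```

This only works for operations that can be applied to each chunk independently, and every chunk has to fit into GPU memory; what I am really after is an object that hides this bookkeeping behind the tensor interface.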

Cheers and thanks,
Markus