Efficient loading of Tensor too large for RAM (for training)?

I have very large tensors that I use as input for training my model (~900 GB in total), so of course they are too large to load into RAM. I was using a TensorDataset built on top of a torch.FloatStorage, which worked well but is apparently extremely slow on the Lustre filesystem I have to use.
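For reference, here is a minimal stdlib-only sketch of the access pattern I need: samples stored as raw float32 on disk, memory-mapped so the OS pages data in on demand, and addressable by index. The class name `MmapFloatDataset` and the fixed-length-sample layout are my own assumptions for illustration, not PyTorch's API:

```python
import mmap
import os
import struct
import tempfile

ITEM_SIZE = 4  # bytes per float32 value


class MmapFloatDataset:
    """Index-addressable float32 samples backed by a memory-mapped file.

    Each sample is `sample_len` consecutive float32 values. The OS pages
    the file in lazily, so the whole file never has to fit in RAM.
    """

    def __init__(self, path, sample_len):
        self.sample_len = sample_len
        self._f = open(path, "rb")
        self._mm = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_READ)
        self._n = len(self._mm) // (sample_len * ITEM_SIZE)

    def __len__(self):
        return self._n

    def __getitem__(self, idx):
        # Random access by index: compute the byte offset of the sample
        # and decode just that slice of the mapping.
        if not 0 <= idx < self._n:
            raise IndexError(idx)
        start = idx * self.sample_len * ITEM_SIZE
        raw = self._mm[start:start + self.sample_len * ITEM_SIZE]
        return struct.unpack(f"<{self.sample_len}f", raw)


# Demo: write 3 samples of length 2 as little-endian float32, read by index.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<6f", *range(6)))

ds = MmapFloatDataset(path, sample_len=2)
print(len(ds))  # 3
print(ds[1])    # (2.0, 3.0)
```

The catch is that on Lustre this random-access pattern causes many small reads, which is exactly where my current setup falls down.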

Are there any alternatives that I have not considered? For my application, the data needs to be accessible by index (as it is with TensorDataset).