Hi all, I’m working with large datasets and noticed that while numpy offers `np.memmap` to map large on-disk arrays into memory without fully loading them, torch doesn’t seem to have a similar feature. Is there a particular reason for this?
While I’m not a core developer, I’d say PyTorch isn’t focused on providing data-engineering tools. What would be the point of maintaining such a tool when memory mapping is already implemented for Parquet, NumPy, HDF5, and many other formats and libraries?
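To illustrate that point: since `torch.from_numpy` wraps a numpy array without copying, you can get memmap-like behavior today by combining it with `np.memmap`. A minimal sketch, where the file path, shape, and dtype are made-up assumptions for the example:

```python
import numpy as np
import torch

# Hypothetical raw binary file holding a float32 array.
# Path, shape, and dtype are assumptions for illustration.
path = "big_array.dat"
shape = (10_000, 128)

# Create the file once so the sketch is self-contained.
np.memmap(path, dtype=np.float32, mode="w+", shape=shape).flush()

# Map the file lazily: nothing is read until elements are accessed.
# mode="r+" keeps the array writable, which avoids torch's warning
# about wrapping non-writable numpy arrays.
arr = np.memmap(path, dtype=np.float32, mode="r+", shape=shape)

# from_numpy shares the underlying buffer (zero-copy), so this tensor
# is backed directly by the memory-mapped file.
t = torch.from_numpy(arr)

# Only the pages actually touched get paged in from disk.
print(t[:4].mean())
```

Because the buffer is shared rather than copied, slicing and reading the tensor only faults in the pages you touch, which is the same behavior numpy users get from `np.memmap` directly.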