Hi all, I’m working with large datasets and noticed that while numpy offers `np.memmap` to map large on-disk arrays into memory without fully loading them, torch doesn’t seem to have a similar feature. Is there a particular reason for this?
While I’m not a core developer, I’d say PyTorch isn’t focused on providing data-engineering tools. What would be the point of maintaining such a tool when memory mapping is already implemented for Parquet, NumPy, HDF5, and many other formats and libraries?
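To illustrate that point: since `torch.from_numpy` wraps a numpy array without copying, you can get memmap-like behavior today by combining it with `np.memmap`. A minimal sketch, where the file path, shape, and dtype are made-up assumptions for the example:

```python
import numpy as np
import torch

# Hypothetical raw binary file holding a float32 array.
# Path, shape, and dtype are assumptions for illustration.
path = "big_array.dat"
shape = (10_000, 128)

# Create the file once so the sketch is self-contained.
np.memmap(path, dtype=np.float32, mode="w+", shape=shape).flush()

# Map the file lazily: nothing is read until elements are accessed.
# mode="r+" keeps the array writable, which avoids torch's warning
# about wrapping non-writable numpy arrays.
arr = np.memmap(path, dtype=np.float32, mode="r+", shape=shape)

# from_numpy shares the underlying buffer (zero-copy), so this tensor
# is backed directly by the memory-mapped file.
t = torch.from_numpy(arr)

# Only the pages actually touched get paged in from disk.
print(t[:4].mean())
```

Because the buffer is shared rather than copied, slicing and reading the tensor only faults in the pages you touch, which is the same behavior numpy users get from `np.memmap` directly.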