I created my own large dataset. Every time I want to run or train anything, the dataset has to be processed. I did my best to optimize this step, but after profiling a run I found that the function “files_exist” (torch_geometric.data.dataset, pytorch_geometric 1.4.3 documentation) takes up 87% of my runtime through its calls to posix.stat. Is there a clear-cut way to optimize this part of the call? Or is there anything I can change about how my files are written or organized to reduce the cost?
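
One workaround I am considering, as a rough sketch only: instead of one stat call per file, list each parent directory once with os.scandir and check the expected names against that listing. The helper name `all_files_exist` is hypothetical and not part of torch_geometric, and I have not measured whether this is actually faster on my storage.

```python
import os
from collections import defaultdict


def all_files_exist(paths):
    """Return True if every path exists, listing each parent directory once.

    Hypothetical helper (not a torch_geometric API). Unlike os.path.exists,
    this also counts broken symlinks as present, because it only compares
    directory entry names.
    """
    # Group the expected file names by their parent directory.
    names_by_dir = defaultdict(set)
    for path in paths:
        parent, name = os.path.split(os.path.abspath(path))
        names_by_dir[parent].add(name)

    for parent, expected in names_by_dir.items():
        try:
            # One directory listing replaces len(expected) separate stat calls.
            with os.scandir(parent) as entries:
                present = {entry.name for entry in entries}
        except (FileNotFoundError, NotADirectoryError):
            return False
        if not expected <= present:
            return False
    return True
```

This would only help if many processed files share a few directories. If every file sits in its own directory, the number of system calls stays about the same.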