Rookie question: how to speed up data loading in PyTorch

This is exactly what DataLoader does if you set num_workers > 0: it then loads batches in background worker processes. (The name DataLoader is unfortunate in my opinion, since it is really an iterator.)

To use DataLoader in your case, you need to implement your own dataset and pass it to the DataLoader when you create it. Your dataset will be a class that implements __getitem__(self, index) and __len__(self). __getitem__ loads and returns a single datapoint from your database (or from wherever else you choose to load your data), and __len__ returns the number of datapoints. You could even have it read from a text file, but you might run into problems with that when using multiple workers.
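As a rough sketch of what such a dataset could look like: the class name RecordDataset, the helper make_loader, and the in-memory list standing in for your database are all made up for illustration, so swap in your own loading logic inside __getitem__.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RecordDataset(Dataset):
    """Hypothetical dataset; the list of records stands in for a real database."""

    def __init__(self, records):
        # records: list of (features, label) pairs (an assumption for this sketch).
        # In practice this could be a database handle or an index of file paths.
        self.records = records

    def __len__(self):
        # Number of datapoints available.
        return len(self.records)

    def __getitem__(self, index):
        # Load and return a single datapoint; DataLoader does the batching.
        features, label = self.records[index]
        x = torch.tensor(features, dtype=torch.float32)
        y = torch.tensor(label, dtype=torch.long)
        return x, y


def make_loader(records, batch_size=4, num_workers=0):
    # num_workers=0 loads in the main process; a value > 0 uses worker processes.
    return DataLoader(
        RecordDataset(records),
        batch_size=batch_size,
        shuffle=False,
        num_workers=num_workers,
    )
```

You would then iterate with something like "for x, y in make_loader(records, num_workers=2):". On platforms that start workers by spawning (Windows, and macOS by default), the code that iterates the loader should sit under an if __name__ == "__main__": guard.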

See an example dataset here, which loads images from directories: