Threading and Dataloader

Loading data from my dataloader takes too much time. I would like to have two processes running in parallel: one that loads data into batches and puts them into a shared queue, and another that performs the training on the GPU. That way I could fully utilize the GPU without waiting for the data to load. Is this possible?

The class should do the trick.

The following blog could be useful:

When I iterate the dataloader:

for idx, sample in enumerate(dataloader):
    # training steps

The first line of code takes one minute to load the data. I would like the data loading and the training to run in parallel.
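
For reference, here is a minimal sketch of the producer/consumer setup described in the question, using only the Python standard library. It is not the approach suggested in the replies above, just one possible way to overlap loading and training. The helper name `prefetch` and the parameter `max_prefetch` are made up for illustration, and it assumes the dataloader is an ordinary iterable. A background thread fills a bounded queue while the main loop consumes from it.

```python
import queue
import threading

_SENTINEL = object()  # unique marker that signals the end of the data


def prefetch(iterable, max_prefetch=4):
    """Yield items from `iterable` while a background thread loads ahead.

    `prefetch` and `max_prefetch` are hypothetical names for this sketch.
    """
    q = queue.Queue(maxsize=max_prefetch)  # bounded, so memory use stays limited

    def producer():
        try:
            for item in iterable:
                q.put(item)  # blocks while the queue is full
        finally:
            # Always signal the end, even if loading raised an exception,
            # so the consumer does not wait forever.
            q.put(_SENTINEL)

    # Daemon thread: it will not keep the program alive if training stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = q.get()  # blocks until the next batch is ready
        if item is _SENTINEL:
            return
        yield item


# Stand-in for the real dataloader: any iterable of batches works here.
dataloader = [[i, i + 1] for i in range(0, 6, 2)]

for idx, sample in enumerate(prefetch(dataloader)):
    pass  # training steps
```

Because of Python's global interpreter lock, a thread like this mostly helps when loading is dominated by I/O (disk or network reads). If the time goes into CPU-heavy preprocessing, worker processes are usually the better fit; PyTorch's `DataLoader` has a `num_workers` argument for that purpose.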