DataLoader hangs with custom Dataset

You could profile the DataLoader (with num_workers>0) and check whether you are seeing spikes in the data loading time. If so, that would point to a data loading bottleneck, which would cause the training loop to wait for the next available batch.
This post explains common bottlenecks and proposes some workarounds, in case you are indeed seeing this issue.
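
As a rough sketch of the profiling suggested above, you could time how long each `next()` call on the loader blocks and flag the outliers. The helper names (`time_batches`, `find_spikes`, `demo_with_dataloader`), the `factor=5.0` threshold, and the `TensorDataset` stand-in are my own assumptions for illustration, not anything from your code. Swap in your custom Dataset.

```python
import time
from statistics import median


def time_batches(loader, max_batches=100):
    """Return the seconds spent waiting for each batch from any iterable."""
    waits = []
    it = iter(loader)
    for _ in range(max_batches):
        start = time.perf_counter()
        try:
            next(it)  # blocks until the next batch is available
        except StopIteration:
            break
        waits.append(time.perf_counter() - start)
    return waits


def find_spikes(waits, factor=5.0):
    """Return indices of batches whose wait exceeds factor x the median wait."""
    if not waits:
        return []
    threshold = factor * median(waits)  # arbitrary heuristic, tune as needed
    return [i for i, w in enumerate(waits) if w > threshold]


def demo_with_dataloader():
    """Hypothetical usage with a PyTorch DataLoader (requires torch)."""
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-in dataset; replace with the custom Dataset that hangs.
    dataset = TensorDataset(torch.randn(1024, 16))
    loader = DataLoader(dataset, batch_size=32, num_workers=2)

    waits = time_batches(loader)
    print(f"median wait: {median(waits):.4f}s, max: {max(waits):.4f}s")
    print("spike batches:", find_spikes(waits))
```

Call `demo_with_dataloader()` from under an `if __name__ == "__main__":` guard, since worker processes may re-import the script on some platforms. The first batch is usually slow because the workers are still warming up, so I would ignore that one. If the later batches show periodic spikes, the workers are likely not keeping up with the training loop.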
