Why batch loading time varies for different batches (Image Loading)?

Yes, it seems the data loading time is large compared to the actual model training time, which would explain the peaks you are seeing in the data loading time.

There might be a potential improvement by replacing:

torch.tensor(self.datasets[index][1], dtype=torch.long)

which will trigger a copy with:

torch.from_numpy(self.datasets[index][1].astype(np.int64))

which would reuse the underlying data (assuming self.datasets returns a numpy array; note that astype will still copy unless the array is already int64 and you pass copy=False).
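A minimal sketch of the difference, using a hypothetical labels array standing in for self.datasets[index][1]: torch.tensor copies the input, while torch.from_numpy shares the numpy buffer.

```python
import numpy as np
import torch

# Hypothetical label array, standing in for self.datasets[index][1]
labels = np.arange(10, dtype=np.int64)

# torch.tensor always copies the input data
t_copy = torch.tensor(labels, dtype=torch.long)

# torch.from_numpy shares memory with the numpy array
# (astype with copy=False is a no-op here, since labels is already int64)
t_shared = torch.from_numpy(labels.astype(np.int64, copy=False))

# Mutating the numpy array shows which tensor shares its buffer
labels[0] = 100
print(t_copy[0].item())    # 0   (independent copy)
print(t_shared[0].item())  # 100 (shares the underlying buffer)
```

If the stored array were a smaller dtype (e.g. int32), astype would still have to copy to produce int64, so the saving only applies when the dtype already matches.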

Yes, that’s correct, and you would have to either speed up the data loading or increase the workload of the model training.
Note that, the smaller the GPU workload is, the more likely you’ll hit a data loading bottleneck.
In the extreme case where an iteration finishes almost instantly (e.g. you remove the model training entirely), the data loading would have to be fast enough to load and process the next batch in the short time Python spends “executing the loop”, which is very little since there is no real workload to overlap with.
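One way to see this in practice is to time how long the training loop waits for each batch. This is a sketch using a small synthetic TensorDataset as a stand-in for the image dataset; the per-batch wait time is what would spike when the workers cannot keep up.

```python
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the image dataset in the question
data = torch.randn(256, 3, 8, 8)
targets = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(data, targets), batch_size=32)

load_times = []
end = time.perf_counter()
for batch, target in loader:
    # Time spent waiting for the next batch. With a real GPU workload
    # below, the workers would have that long to prefetch the next batch.
    load_times.append(time.perf_counter() - end)

    # (model training step would go here)

    end = time.perf_counter()

print(f"batches: {len(load_times)}, max wait: {max(load_times):.6f}s")
```

With num_workers > 0, the wait times shrink as long as each training step gives the workers enough time to prepare the next batch in the background.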
