When I use num_workers > 0, my DataLoader workers randomly get extremely slow at the end of an iteration.
Here’s part of my code:
```python
import time
import torch

trainPhase_end_time = time.time()  # initialize so the first print below doesn't fail
for epoch in range(1, num_epochs):
    for i, (x_train, y_train) in enumerate(train_batch):  # y_train is the label
        trainPhase_start_time = time.time()
        # time spent waiting for the next batch from the DataLoader
        print('trainPhaseEnd-NextTrainPhaseStart:', i, ' ',
              trainPhase_start_time - trainPhase_end_time)
        x_train = x_train.type(torch.FloatTensor).to(device)
        y_train = y_train.type(torch.FloatTensor).to(device)
        predict_train = net(x_train)
        loss = criterion(predict_train, y_train)
        optimizer.zero_grad()
        loss.backward()  # backward
        optimizer.step()
        trainPhase_end_time = time.time()
        # time spent on the forward/backward pass for a single batch
        print('trainPhaseStart-trainPhaseEnd (single batch):',
              trainPhase_end_time - trainPhase_start_time)
```
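For context, `train_batch` is a standard `torch.utils.data.DataLoader`. A minimal sketch of how it is constructed (the dataset, batch size, and shuffle flag here are placeholders; only `num_workers=16` matters for the behavior described below):

```python
from torch.utils.data import DataLoader

# Placeholder dataset and batch size; num_workers=16 matches the stalls below.
train_batch = DataLoader(
    train_dataset,   # any map-style Dataset
    batch_size=64,   # placeholder value
    shuffle=True,
    num_workers=16,  # each worker loads batches in a separate process
)
```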
Every 16 iterations (16 is my num_workers value), data loading may get stuck.
And after roughly 100 iterations, the stall starts happening irregularly.
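To check whether the stalls come from the loader itself rather than the training step, one can time iteration over the `DataLoader` alone (a diagnostic sketch, not part of my original run):

```python
import time

# Iterate over the loader without any training work; long gaps here
# point to data loading/worker startup rather than the GPU step.
t_prev = time.time()
for i, (x, y) in enumerate(train_batch):
    t_now = time.time()
    print(f'batch {i}: waited {t_now - t_prev:.4f}s for data')
    t_prev = time.time()
```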