I am puzzling over the timing info from the following simple loop with a data loader. The observation is that tdata is more than half of titer. Is there any kind of synchronization happening in the data loader that waits for the backward pass to finish?
print("iter: ", “batch_idx”, “tdata”, “tfwd”, “tloss”, “tbwd”, “titer”, “titercheck”, “args.batch_size/titer”, “args.batch_size/(titer-tdata)”, “time.time()” )
extralong = []
time00 = time.time()
for batch_idx, (data, target) in enumerate(train_loader):
    data = data.to(args.device)
    target = target.to(args.device)
    if args.nhwc:
        data = data.to(memory_format=torch.channels_last)
    tdata = time.time() - time00
    optimizer.zero_grad()
    output = model(data)
    tfwd = time.time() - tdata - time00
    loss = criterion(output, target)
    tloss = time.time() - tdata - tfwd - time00
    loss.backward()
    optimizer.step()
    tend = time.time()
    tbwd = tend - tdata - tfwd - tloss - time00
    titer = tdata + tfwd + tloss + tbwd
    titercheck = tend - time00
    time00 = time.time()
    if True:
        print("iter: ", batch_idx, tdata, tfwd, tloss, tbwd, titer, args.batch_size/titer, args.batch_size/(titer-tdata), len(extralong), time.time())
print("iter: ", batch_idx, tdata, tfwd, tloss, tbwd, titer, args.batch_size/titer, args.batch_size/(titer-tdata), len(extralong), time.time() )