In one of my training loops I’m logging quite a bit of debug information after every parameter update. When I profile this loop with cProfile, string formatting takes up a significant chunk of the total training runtime (something like 5%).
That doesn’t seem right to me, since the forward and backward passes take around 2–3 seconds per iteration. Is there some way I can run the logging in parallel with the backward pass? And is that likely to solve the problem?
My training loop looks something like this:
for img in dataloader:
    img = img.to('cuda')
    optimizer.zero_grad()
    pred, label = network(img)
    loss = loss_fn(pred, label)
    loss.backward()
    optimizer.step()
    fmt = "loss was {:1.3e}"
    msg = fmt.format(loss)
    logger.info(msg)
    # some more complicated logging stuff
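For the "logging in parallel" part, I was imagining offloading the work to a background thread, roughly like the sketch below. It uses the standard library's `logging.handlers.QueueHandler`/`QueueListener`. One caveat I'm aware of: by default `QueueHandler.prepare` formats the message in the calling thread, so the sketch overrides `prepare` to enqueue the raw record, deferring %-formatting to the listener thread (this is fine within one process, where records don't need to be pickled). The logger name and the `StringIO` sink are just for illustration.

```python
import io
import logging
import logging.handlers
import queue

class RawQueueHandler(logging.handlers.QueueHandler):
    """Enqueue the record unchanged so %-formatting happens on the listener thread."""
    def prepare(self, record):
        return record  # safe in-process: args stay attached, no pickling needed

log_queue = queue.Queue(-1)  # unbounded queue between training loop and listener

# The listener runs in its own thread and does the actual formatting + I/O.
buf = io.StringIO()  # stand-in for a real file/stream handler
stream_handler = logging.StreamHandler(buf)
stream_handler.setFormatter(logging.Formatter("%(message)s"))
listener = logging.handlers.QueueListener(log_queue, stream_handler)
listener.start()

logger = logging.getLogger("train")
logger.setLevel(logging.INFO)
logger.addHandler(RawQueueHandler(log_queue))

# Inside the training loop this is cheap: the record is only enqueued here.
logger.info("loss was %1.3e", 0.123)

listener.stop()  # drains the queue and flushes the handler
print(buf.getvalue().strip())
```

If this is the right approach, I'd presumably also want to pass the loss as a lazy argument (`logger.info("loss was %1.3e", loss)`) rather than pre-formatting it with `str.format`, so no formatting at all happens on the hot path.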