How to measure the execution time (training and inference)

Hi everyone,

We aim to measure the execution time in GPU mode, taking MNIST as an example. The code is as follows.

    for epoch in range(1, args.epochs + 1):
        # training time
        torch.cuda.synchronize()
        since = int(round(time.time() * 1000))
        train(args, model, device, train_loader, optimizer, epoch)
        torch.cuda.synchronize()
        time_elapsed = int(round(time.time() * 1000)) - since
        print('training time elapsed {}ms'.format(time_elapsed))

        # test (inference) time
        torch.cuda.synchronize()
        tsince = int(round(time.time() * 1000))
        test(args, model, device, test_loader)
        torch.cuda.synchronize()
        ttime_elapsed = int(round(time.time() * 1000)) - tsince
        print('test time elapsed {}ms'.format(ttime_elapsed))

The result is as follows.

training time elapsed 13325ms
test time elapsed 2115ms

However, we find that there is almost no difference between training time and inference time when we calculate the average time per sample (total time / num_samples), that is, 13325/60000 vs. 2115/10000. We are confused by this result.
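Spelling out the arithmetic with the measured totals above:

```python
# Average time per sample, computed from the measured totals above.
train_ms_per_sample = 13325 / 60000  # training: ~0.222 ms per sample
test_ms_per_sample = 2115 / 10000    # inference: ~0.212 ms per sample
print(train_ms_per_sample, test_ms_per_sample)
```

So per sample, training is only about 5% slower than inference, even though training also runs a backward pass.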

I think your basic timing setup makes sense (congrats!).
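One optional refinement (a sketch, not from your post): `torch.cuda.Event` measures elapsed time on the GPU timeline directly, instead of bracketing `time.time()` with synchronizations. The helper below falls back to `time.perf_counter()` when no CUDA device (or no torch install) is available.

```python
import time

try:
    import torch
    use_cuda = torch.cuda.is_available()
except ImportError:  # CPU-only sketch still works without torch
    torch, use_cuda = None, False

def timed_ms(fn):
    """Return fn's elapsed time in milliseconds.

    On a CUDA device, CUDA events time the GPU work itself; otherwise
    a plain wall-clock measurement is used.
    """
    if use_cuda:
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        fn()
        end.record()
        torch.cuda.synchronize()  # wait so elapsed_time() is valid
        return start.elapsed_time(end)  # milliseconds
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1000.0
```

You could then wrap your existing calls, e.g. `timed_ms(lambda: train(args, model, device, train_loader, optimizer, epoch))`.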

You’d have to factor in things such as batch size, the number of synchronizations during training, etc.
For MNIST, the main cost might not be the actual compute but the latency of getting data to the GPU and results back to the CPU. If training and testing use the same batch size, you would then expect roughly proportional per-sample times regardless of the extra computation the backward pass takes.
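To make that intuition concrete, here is a toy model with made-up numbers (not measurements): if a fixed per-batch overhead (kernel launches, CPU↔GPU transfers) dominates, the per-sample time barely changes even when training does roughly 3x the compute of inference.

```python
# Toy model: per-batch time = fixed overhead (launch/transfer) + compute.
# All millisecond values below are invented for illustration only.
def per_sample_ms(num_samples, batch_size, overhead_ms, compute_ms):
    """Total loop time divided by sample count (ignoring the ragged last batch)."""
    num_batches = num_samples // batch_size
    total_ms = num_batches * (overhead_ms + compute_ms)
    return total_ms / num_samples

# Same batch size for both; training compute (forward + backward) is ~3x
# the inference compute, but the fixed overhead dominates either way.
train = per_sample_ms(60000, batch_size=64, overhead_ms=10.0, compute_ms=1.0)
test = per_sample_ms(10000, batch_size=64, overhead_ms=10.0, compute_ms=0.35)
```

With these assumed numbers, `train` and `test` land within about 10% of each other, mirroring the 0.22 ms vs. 0.21 ms per-sample figures in the question.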

Best regards