CUDA operations are executed asynchronously, so you need to synchronize the code via torch.cuda.synchronize() before starting and before stopping the timer. The cpu() operation will synchronize the code in your example, so the printed time most likely covers the model execution plus the data transfer back to the host.
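Something like the following is a minimal sketch of that timing pattern. The nn.Linear model, the input shape, and the timed_forward helper name are placeholders I made up for illustration, not taken from your code:

```python
import time
import torch

def timed_forward(model, x):
    """Return (output, seconds) for one forward pass (hypothetical helper)."""
    use_cuda = x.is_cuda
    with torch.no_grad():
        if use_cuda:
            torch.cuda.synchronize()  # wait for pending kernels before starting the clock
        start = time.perf_counter()
        out = model(x)
        if use_cuda:
            torch.cuda.synchronize()  # wait for the forward pass to actually finish
        elapsed = time.perf_counter() - start
    return out, elapsed

# Placeholder model and input; falls back to the CPU if no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)

out, seconds = timed_forward(model, x)
print(f"forward pass: {seconds * 1e3:.3f} ms")
```

Since the timer stops before any cpu() call, the device-to-host copy is not included in the measurement. The first call on a GPU usually includes one-time startup overhead, so you would probably want to run a few warm-up iterations and average over several runs.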
It helps a lot, thanks.