Hi, I ran into something interesting, and I'm not sure whether it's a mistake on my part.
When training a model, within one epoch the forward pass (images going through the model) takes 2.54 s for the first mini-batch, and then about 0 s for each of the next 10 mini-batches (I didn't check all 851 mini-batches, only the first 10). Please see the code:
for batch_idx, (imgs, ...) in enumerate(train_loader):
    optimizer.zero_grad()
    image = imgs.to(device)
    time1_start = time.time()
    x, y = model(image)
    time1_end = time.time()
    print('time1:%.2f' % (time1_end - time1_start))
    # printed times: 2.54, 0.00, 0.00, 0.00, ...
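For reference, the timing pattern above is just wall-clock deltas around the forward call. A minimal self-contained sketch of that same pattern, with a stand-in function in place of model(image) since the real train_loader, model, and device aren't shown here:

```python
import time

def fake_forward(batch):
    # Stand-in for model(image): just does some CPU work
    return sum(x * x for x in batch)

# Three dummy "mini-batches"
batches = [list(range(100000)) for _ in range(3)]

times = []
for batch in batches:
    t0 = time.time()
    fake_forward(batch)
    t1 = time.time()
    times.append(t1 - t0)
    print('time1:%.2f' % (t1 - t0))
```

Note that time.time() only measures elapsed wall-clock time between the two calls; if any work is done asynchronously (e.g. on an accelerator), the delta may not reflect the actual compute time of the forward pass.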
So it appears that after the first mini-batch, the time for images to go through the model is nearly 0.
As a student, I'm curious and would like to understand the reason.
Thanks.