I have built an FCN-32s network for semantic segmentation with PyTorch. When I evaluated the network's inference speed, I found something strange, as follows.
First, I define the network:
import time
import torch
from torch.autograd import Variable

model = FCN32s(NUM_CLASSES=20)  # my own FCN-32s implementation
model = model.cuda()
model.eval()
Next, I define an input tensor, run the network 200 times, and calculate the average inference time. Here is where something strange happens.
When I define the input tensor outside the loop, as follows, the average inference time is 0.017 s (i.e. 58.8 FPS).
batch = torch.FloatTensor(1, 3, 512, 1024)
batch = batch.cuda()
inputs = Variable(batch, volatile=True)  # volatile=True: inference only, no autograd graph
total_time = 0.0
for i in range(1, 201):
    pre_time = time.time()
    outputs = model(inputs)
    total_time += time.time() - pre_time
print('average inference time: %.4fs' % (total_time / 200))
However, when I define the input tensor inside the loop, as follows, the average inference time is 0.0024 s (i.e. 417.7 FPS).
total_time = 0.0
for i in range(1, 201):
    batch = torch.FloatTensor(1, 3, 512, 1024)
    batch = batch.cuda()
    inputs = Variable(batch, volatile=True)
    pre_time = time.time()
    outputs = model(inputs)
    total_time += time.time() - pre_time
print('average inference time: %.4fs' % (total_time / 200))
The results are very different! Why does this difference happen, and what is the true inference speed of the network? Thanks!
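In case it is relevant: I suspect CUDA's asynchronous execution might be involved, since model(inputs) may return before the GPU has actually finished. Below is a minimal sketch of how I would try timing with explicit synchronization, assuming torch.cuda.synchronize() is the right call to wait for pending GPU work. Is this the correct way to measure?

# warm-up run so one-time CUDA setup cost is not included in the timing
outputs = model(inputs)
torch.cuda.synchronize()  # wait for all queued GPU work to finish

total_time = 0.0
for i in range(200):
    torch.cuda.synchronize()          # make sure nothing is still pending
    pre_time = time.time()
    outputs = model(inputs)
    torch.cuda.synchronize()          # block until the forward pass really finishes
    total_time += time.time() - pre_time
print('average inference time with sync: %.4fs' % (total_time / 200))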