I have built an FCN-8s network for semantic segmentation in PyTorch. When I evaluated the network's inference speed, I found something strange, as follows:

First, I define the network:

```python
import time

import torch
from torch.autograd import Variable  # pre-0.4 PyTorch API

model = FCN32s(NUM_CLASSES=20)
model = model.cuda()
model.eval()  # evaluation mode (affects BatchNorm/Dropout)
```

Next, I define an input tensor, run the network 200 times, and compute the average inference time. This is where something strange happened.

When I define the input tensor outside the loop, as follows, the average inference time is 0.017 s (i.e. 58.8 FPS).

```python
batch = torch.FloatTensor(1, 3, 512, 1024)
batch = batch.cuda()
inputs = Variable(batch, volatile=True)  # inference only, no autograd graph

total_time = 0.0
for i in range(200):
    pre_time = time.time()
    outputs = model(inputs)
    total_time += time.time() - pre_time  # accumulate per-iteration time

print(total_time / 200)  # average over 200 runs: ~0.017 s
```
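(A side question: I believe the very first forward pass can be much slower because of one-time CUDA context creation and cuDNN setup, so perhaps I should also discard a few warm-up iterations before averaging. A sketch of what I mean, where `WARMUP = 10` is just a number I picked:)

```python
WARMUP = 10  # hypothetical number of warm-up iterations to discard

total_time = 0.0
for i in range(200 + WARMUP):
    pre_time = time.time()
    outputs = model(inputs)
    if i >= WARMUP:  # only count iterations after warm-up
        total_time += time.time() - pre_time

print(total_time / 200)  # average excluding warm-up
```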

However, when I define the input tensor inside the loop, as follows, the average inference time is 0.0024 s (i.e. 417.7 FPS).

```python
total_time = 0.0
for i in range(200):
    batch = torch.FloatTensor(1, 3, 512, 1024)
    batch = batch.cuda()
    inputs = Variable(batch, volatile=True)

    pre_time = time.time()
    outputs = model(inputs)
    total_time += time.time() - pre_time

print(total_time / 200)  # average over 200 runs: ~0.0024 s
```

The results are very different! Why does this difference happen, and what is the true inference speed of the network?
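One guess: maybe CUDA kernels are launched asynchronously, so `time.time()` returns before the GPU has actually finished the forward pass. If that is the case, I think the timing loop would need explicit synchronization, something like this sketch (using `torch.cuda.synchronize()`, which as far as I know blocks until all queued GPU work completes):

```python
total_time = 0.0
for i in range(200):
    torch.cuda.synchronize()  # make sure previously queued GPU work is done
    pre_time = time.time()
    outputs = model(inputs)
    torch.cuda.synchronize()  # wait for this forward pass to actually finish
    total_time += time.time() - pre_time

print(total_time / 200)  # should be the wall-clock forward-pass average
```

Is this the right way to measure it? Thanks!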