PVANET weird benchmark results

Hi there,

I want to use PVANET for segmentation, so I trained the net on my Titan X. However, it seems to be slower than Res18: with an input size of 224*224 and a batch size of 1, PVANET takes 20ms while Res18 takes 5ms. According to the paper, it should be faster than Res18 (though the paper only mentions Res101).

I first thought it might be related to GPU usage, so I changed the input size from 224 to 996 and the batch size from 1 to 32, and compared Res101 with PVANET. Both nets take the same time for one batch: 0.45s.

Here is my prototxt for PVANET (with a few modifications). For the time benchmark, I only measured the time of the single line “out = model(input)”, using time.time().

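When benchmarking on the GPU, you need to call torch.cuda.synchronize() before start = time.time() and before end = time.time(). Otherwise you won't get correct timings: CUDA calls are asynchronous, so without synchronizing you mostly measure the time it takes to queue the kernels, not the time it takes to run them.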
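A minimal sketch of what that could look like. The helper name time_forward, the warm-up iterations, and the averaging over several runs are my own additions, not something from the posts above, and the Conv2d layer in demo() is only a stand-in for the real network (it assumes PyTorch is installed).

```python
import time


def time_forward(fn, sync=None, warmup=3, iters=10):
    """Return the mean wall-clock seconds per call of fn().

    sync is called right before each clock read so that queued GPU
    work has finished; pass torch.cuda.synchronize when running on CUDA.
    """
    # Warm-up runs (an assumption of this sketch): the first calls often
    # include one-time setup costs that should not be counted.
    for _ in range(warmup):
        fn()

    if sync is not None:
        sync()  # wait for pending work before starting the clock
    start = time.time()

    for _ in range(iters):
        fn()

    if sync is not None:
        sync()  # wait for the timed work before stopping the clock
    end = time.time()

    return (end - start) / iters


def demo():
    # Hypothetical usage; the layer below is a placeholder, not PVANET.
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Conv2d(3, 16, 3).to(device).eval()
    x = torch.randn(1, 3, 224, 224, device=device)
    sync = torch.cuda.synchronize if device == "cuda" else None

    with torch.no_grad():
        return time_forward(lambda: model(x), sync=sync)
```

Calling demo() returns the average time in seconds for one forward pass. Swapping in the actual model and input should be enough to redo the comparison from the question.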