I am wondering why CPU inference time varies between Vgg16 and ResNet18. I am using the following script to measure the inference time on CPU for three different models, which I trained from scratch on my custom dataset.
Inference times: ResNet18 = 12.88 milliseconds, Vgg16 = 66.85 milliseconds, and my proposed model = 11.72 milliseconds.
Also, the parameter counts for the models are as follows:
ResNet18 : 11.1 M
Vgg16: 13.4 M
proposed model: 33 K
My question is: why does ResNet18, with 11.1 M parameters, take ~13 ms, while the proposed model, with only 33 K parameters, takes ~12 ms?
P.S. I measure inference time for one image, 100 times, and then I report the average.
Here is my snippet. Am I missing something?
from ResNet_model_ResNet18 import model
#from vgg_model_vgg16 import model
import torch
import torch.optim as optim
import torch.nn as nn
import time
from PIL import Image
import torchvision.transforms.functional as TF
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4)
def main():
    epoch = 800
    PATH = 'C:/test_inference_time/resnet18_epoch{}.pth'.format(epoch)
    checkpoint = torch.load(PATH, map_location = 'cpu')
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    model.eval()
    with torch.no_grad():
        for step in range(0, 100):
            t0 = time.time_ns()
            width = 32
            height = 32
            t_image = Image.open('C://test_inference_time//test//01a1d54e8b64e81d_b44.png')
            t_image = t_image.resize((width, height), Image.BILINEAR)
            t_image = TF.to_tensor(t_image)
            t_image = t_image.unsqueeze_(0)
            t_image = t_image #.to(device)
            t1 = time.time_ns() - t0 # elapsed time after loading the image
            logits= model(t_image)
            _, predicted = torch.max(logits, 1)
            t2 = time.time_ns() - t0 # elapsed time after loading plus model and prediction
            t3 = t2 - t1 # time of model and prediction only
            print('{:.00f} minutes'.format((t3) / 6e10), '{:.00f} second'.format((t3) / 1e9), '{:.00f} millisecond'.format((t3) / 1e+6), '{:.00f} microsecond'.format((t3) / 1000),'{:.00f} nanosec'.format((t3)% 1e9))
if __name__ == '__main__':
    main()
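The averaging step mentioned above (one image, 100 runs, mean reported) is not shown in the snippet itself. A minimal sketch of it follows, using only the standard library. The helper name `average_latency_ms`, the warm-up count, and the optional `sync` hook are assumptions made for illustration and are not part of the original script.

```python
import statistics
import time


def average_latency_ms(fn, runs=100, warmup=10, sync=None):
    """Call fn() repeatedly and return (mean, population stdev) in milliseconds.

    fn     : zero-argument callable to time (hypothetical; e.g. one forward pass)
    warmup : untimed calls made first, so one-time setup costs are excluded
    sync   : optional zero-argument callable invoked before the timer stops,
             for devices that execute work asynchronously
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()  # monotonic, high-resolution clock
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter_ns() - start) / 1e6)

    return statistics.mean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    # Stand-in workload; replace with the real forward pass.
    mean_ms, std_ms = average_latency_ms(lambda: sum(range(10_000)))
    print("{:.3f} ms +/- {:.3f} ms".format(mean_ms, std_ms))
```

With the script above, this could presumably be called as `average_latency_ms(lambda: model(t_image))` inside the `torch.no_grad()` block, with the image loaded once beforehand. For GPU measurements, passing `sync=torch.cuda.synchronize` would make the timer wait for queued kernels to finish before stopping.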
I also measured inference time on the GPU for the same models, and I am surprised to see ResNet18 = 10.21 ms, Vgg16 = 5.49 ms, and the proposed model = 4.76 ms.
Thanks in advance,
Neda