Unexpected PyTorch Profiler results

Hello,

I ran the PyTorch profiler to measure the total CPU times for the following models:

Model name          CPU total time
ResNet18            37.236ms
ProxylessNAS (CPU)  69.561ms
MobileNetV2         64.838ms

The machine has a 14-core Intel Core i9-10940X CPU @ 3.30GHz.

Both ProxylessNAS (CPU) and MobileNetV2 are slower than ResNet18, but I was expecting the opposite.

The script is as follows. Am I missing anything?

import torch
from torch.autograd import profiler


# Uncomment exactly one of the following lines to select the model to profile.
# model = torch.hub.load('pytorch/vision:v0.6.0', 'resnet18', pretrained=True)
# model = torch.hub.load('mit-han-lab/ProxylessNAS', 'proxyless_cpu', pretrained=True)
model = torch.hub.load('pytorch/vision:v0.6.0', 'mobilenet_v2', pretrained=True)  # models.mobilenetv2()

inputs = torch.randn((1, 3, 224, 224))

with torch.no_grad():
    with profiler.profile(record_shapes=True) as prof:
        with profiler.record_function('model_inference'):
            model(inputs)

print(prof.key_averages().table(sort_by='cpu_time_total', row_limit=10))
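
One thing worth ruling out is that the script profiles a single forward pass, so the reported time may include one-time costs from the first call. Below is a minimal sketch of a warm-up-then-repeat timing helper that could be used as a cross-check. The name `benchmark` and its `warmup`/`repeats` parameters are hypothetical and not part of PyTorch, and the helper only measures wall-clock time with the standard library.

```python
import statistics
import time


def benchmark(fn, warmup=5, repeats=20):
    """Return the median wall-clock time of fn() in milliseconds.

    `benchmark`, `warmup` and `repeats` are illustrative names,
    not part of PyTorch or its profiler.
    """
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()

    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1e3)

    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(timings_ms)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any model.
    print(f"{benchmark(lambda: sum(range(100_000))):.3f} ms")
```

With the script above, this could be called as `benchmark(lambda: model(inputs))` inside the `torch.no_grad()` block. It is probably also worth calling `model.eval()` first, since models loaded through `torch.hub.load` are typically in training mode unless switched explicitly.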

The detailed outputs are as follows:

ResNet18

---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Name                         Self CPU total %  Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls
---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
model_inference              6.02%            2.241ms          100.00%          37.236ms         37.236ms         1
conv2d                       0.17%            63.016us         46.30%           17.240ms         861.987us        20
convolution                  0.20%            75.310us         46.13%           17.177ms         858.836us        20
_convolution                 0.69%            256.361us        45.93%           17.101ms         855.071us        20
mkldnn_convolution           44.84%           16.696ms         45.14%           16.810ms         840.478us        20
batch_norm                   0.21%            78.296us         31.22%           11.626ms         581.318us        20
_batch_norm_impl_index       0.22%            80.086us         31.01%           11.548ms         577.404us        20
native_batch_norm            18.85%           7.018ms          30.71%           11.436ms         571.781us        20
max_pool2d                   0.05%            17.052us         12.65%           4.709ms          4.709ms          1
max_pool2d_with_indices      12.58%           4.684ms          12.60%           4.692ms          4.692ms          1
---------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Self CPU time total: 37.236ms

ProxylessNAS (CPU)

--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
Name                        Self CPU total %  Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  
--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
model_inference             8.69%            6.046ms          100.00%          69.561ms         69.561ms         1                
conv2d                      0.20%            140.364us        44.30%           30.818ms         505.213us        61               
convolution                 0.20%            137.516us        44.10%           30.678ms         502.912us        61               
_convolution                1.09%            754.904us        43.90%           30.540ms         500.657us        61               
batch_norm                  0.29%            203.942us        43.69%           30.394ms         498.257us        61               
_batch_norm_impl_index      0.31%            214.853us        43.40%           30.190ms         494.914us        61               
native_batch_norm           23.48%           16.333ms         42.96%           29.885ms         489.923us        61               
mkldnn_convolution          42.20%           29.355ms         42.66%           29.678ms         486.520us        61               
select                      13.17%           9.160ms          17.69%           12.303ms         3.191us          3855             
as_strided                  3.23%            2.247ms          3.23%            2.247ms          0.582us          3858             
--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
Self CPU time total: 69.561ms

MobileNetV2

--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
Name                        Self CPU total %  Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls  
--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
model_inference             7.19%            4.662ms          100.00%          64.838ms         64.838ms         1                
batch_norm                  0.27%            174.627us        50.49%           32.736ms         629.534us        52               
_batch_norm_impl_index      0.29%            186.420us        50.22%           32.561ms         626.176us        52               
native_batch_norm           27.76%           18.001ms         49.82%           32.301ms         621.170us        52               
conv2d                      0.18%            118.212us        38.70%           25.091ms         482.519us        52               
convolution                 0.22%            144.360us        38.52%           24.973ms         480.245us        52               
_convolution                1.23%            800.032us        38.29%           24.828ms         477.469us        52               
mkldnn_convolution          36.46%           23.642ms         36.91%           23.934ms         460.270us        52               
select                      15.08%           9.777ms          20.58%           13.342ms         3.572us          3735             
as_strided                  3.94%            2.553ms          3.94%            2.553ms          0.683us          3738             
--------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------  
Self CPU time total: 64.838ms
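
To make the three tables easier to compare, here is a small sketch that recomputes two figures from them: the slowdown relative to ResNet18, and the time `native_batch_norm` spends in child ops (its CPU total minus its Self CPU total). All numbers are copied from the tables above. The dictionary layout and the function names `slowdown` and `child_ms` are illustrative and not part of the profiler API.

```python
# Values in milliseconds, copied from the profiler tables above.
profiles = {
    "ResNet18": {
        "total": 37.236, "bn_total": 11.436, "bn_self": 7.018,
    },
    "ProxylessNAS (CPU)": {
        "total": 69.561, "bn_total": 29.885, "bn_self": 16.333,
    },
    "MobileNetV2": {
        "total": 64.838, "bn_total": 32.301, "bn_self": 18.001,
    },
}


def slowdown(model, baseline="ResNet18"):
    """Ratio of a model's total CPU time to the baseline's."""
    return profiles[model]["total"] / profiles[baseline]["total"]


def child_ms(model):
    """Time native_batch_norm spends in ops it calls (total minus self)."""
    p = profiles[model]
    return p["bn_total"] - p["bn_self"]


if __name__ == "__main__":
    for name in profiles:
        print(f"{name}: {slowdown(name):.2f}x ResNet18, "
              f"native_batch_norm child time {child_ms(name):.3f} ms")
```

For the two mobile models, the child time of `native_batch_norm` is close to the CPU total reported for `select` (12.303ms and 13.342ms), which may indicate where the thousands of `select` calls come from, although the tables alone do not confirm this.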