How is the PyTorch pip version compiled? (flags, libs, ...) [CPU only]

Hello,
When building PyTorch from source (no CUDA, no distributed, CPU only), I noticed a big difference in performance compared with the version I installed via pip (http://download.pytorch.org/whl/cpu/torch-0.4.0-cp27-cp27mu-linux_x86_64.whl): in the profiles below, the source build takes about twice as long.
After some trials, I managed to get similar performance by linking against the MKL library (installing mkl using pip and then building again).
Is there a link to the build script used to create the CPU-only wheel, or at least a list of the flags and libraries it uses?
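
In case it helps anyone comparing builds, here is a minimal sketch for checking whether a given install reports MKL support. It assumes `torch.backends.mkl.is_available()` is present, which is true of recent PyTorch releases but may not hold for 0.4.0, so the lookup is guarded. The helper name `mkl_status` is just for illustration.

```python
def mkl_status():
    """Return True/False if torch reports MKL availability,
    or None if torch (or the MKL query) is not available."""
    try:
        import torch
    except (ImportError, OSError):
        # torch is not installed or failed to load in this environment
        return None
    # Assumption: torch.backends.mkl.is_available() exists in this version;
    # fall back to None on older builds that lack it.
    mkl = getattr(getattr(torch, "backends", None), "mkl", None)
    if mkl is None or not hasattr(mkl, "is_available"):
        return None
    return bool(mkl.is_available())


status = mkl_status()
if status is None:
    print("Could not query MKL support from this environment")
else:
    print("Built with MKL: {}".format(status))
```

Running this against both the pip wheel and the source build should show whether the MKL linkage is what differs between them.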

Example output of the profiler for the different versions (pip, source, source+mkl):

=== PyTorch 0.4.0 (CPU only) pip - Python 2.7 ===

Timer unit: 1e-06 s

Total time: 4.231 s
File: extension_test.py
Function: forward at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                               @profile
    37                                               def forward(self, x):
    38      5001    3640815.0    728.0     86.1          x = self.conv_layers(x)
    39                                                   # print(x.shape)
    40      5001      36261.0      7.3      0.9          x = x.view(x.size(0), -1)
    41      5001     550743.0    110.1     13.0          x = self.fc(x)
    42      5001       3182.0      0.6      0.1          return x

=== Built from source v0.5.0a0+e6f7e18 (master branch) - Python 2.7 ===

Timer unit: 1e-06 s

Total time: 8.65896 s
File: extension_test.py
Function: forward at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                               @profile
    37                                               def forward(self, x):
    38      5001    7399796.0   1479.7     85.5          x = self.conv_layers(x)
    39                                                   # print(x.shape)
    40      5001      62640.0     12.5      0.7          x = x.view(x.size(0), -1)
    41      5001    1191392.0    238.2     13.8          x = self.fc(x)
    42      5001       5129.0      1.0      0.1          return x

=== Built from source with MKL support - Python 2.7 ===

Timer unit: 1e-06 s

Total time: 4.33177 s
File: extension_test.py
Function: forward at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                               @profile
    37                                               def forward(self, x):
    38      5001    3731289.0    746.1     86.1          x = self.conv_layers(x)
    39                                                   # print(x.shape)
    40      5001      37394.0      7.5      0.9          x = x.view(x.size(0), -1)
    41      5001     560042.0    112.0     12.9          x = self.fc(x)
    42      5001       3047.0      0.6      0.1          return x
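
To put numbers on the comparison, the small sketch below divides each run's total time by the pip wheel's total. The timings are copied from the three profiler runs above; the dictionary keys and the helper name `relative_to` are only illustrative.

```python
# Total times in seconds, copied from the three profiler runs above.
totals = {
    "pip 0.4.0": 4.231,
    "source (no MKL)": 8.65896,
    "source + MKL": 4.33177,
}


def relative_to(baseline, timings):
    """Return each timing divided by the baseline timing."""
    base = timings[baseline]
    return {name: t / base for name, t in timings.items()}


ratios = relative_to("pip 0.4.0", totals)
for name, ratio in ratios.items():
    print("{}: {:.2f}x the pip wheel's time".format(name, ratio))
```

The source build without MKL comes out at roughly twice the pip wheel's time, while the MKL build is within a few percent of it.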

Here are links to the build scripts used to build the CPU-only wheels:

https://github.com/pytorch/builder/tree/master/manywheel (linux)
https://github.com/pytorch/builder/tree/master/wheel (osx)
