How to build pytorch from source and get a pip wheel?

See the title. I want to build from source and get a .whl, because I don’t want to install CUDA, MKL, etc. as conda packages, and I want PyTorch to be self-contained. Thanks.

Currently, I can only install PyTorch as a pip wheel using the precompiled binaries from the official website. This means I cannot pick up bug fixes that have already landed in the master branch.


You can follow the usual instructions for building from source and call setup.py bdist_wheel instead of setup.py install.
This will put the .whl in the dist directory.
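
As a rough sketch of that step, driven from Python: the snippet below runs setup.py bdist_wheel in a source checkout and then lists whatever .whl files ended up in dist. The helper names build_wheel and find_wheels and the checkout path in the usage comment are made up for illustration; they are not part of PyTorch.

```python
import subprocess
import sys
from pathlib import Path


def build_wheel(source_dir):
    """Run `setup.py bdist_wheel` inside a source checkout (hypothetical helper)."""
    # Same build as `setup.py install`, except the result is packaged
    # into a .whl file under dist/ instead of being installed.
    subprocess.run(
        [sys.executable, "setup.py", "bdist_wheel"],
        cwd=str(source_dir),
        check=True,  # raise CalledProcessError if the build fails
    )


def find_wheels(source_dir):
    """Return the .whl files in <source_dir>/dist, sorted by name."""
    return sorted(Path(source_dir, "dist").glob("*.whl"))


# Usage (the checkout path "pytorch" is an assumption):
#   build_wheel("pytorch")
#   for wheel in find_wheels("pytorch"):
#       print(wheel)  # pass this path to `pip install`
```

The build itself can take a long time, so the two steps are kept separate: find_wheels can be called on its own once a build has finished.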

Best regards



So that would give me a wheel similar to those on the official website, with everything self-contained, right?

Here is the script to build the wheel:


Thanks. So what are the dependencies? Just an environment with conda installed (plus some new enough gcc, say 4.8)?

Yes, that’s about it.


Hi Soumith,

I checked some parts of the builder script.

Although I haven’t tested it, it seems that CUDA and cuDNN are not handled by conda; instead, they are installed in system-level directories.

So it seems I should set up an environment like that in, and then run the scripts under wheel?

Yes, you are correct. You will need this environment:


@smth @zym1010 Excuse me, I have been facing the same problem recently, so I hope you can help me. Since I am working on an offline machine with Ubuntu 16.04, I can’t directly use the builder script in the repository smth offered. Instead, I wrote a new one according to In this script, the environment variables and the deps list remain the same.

The problem is that, using this script, I can obtain a .whl file and install it on another machine (with CUDA installed). However, I find that the step “lambda t: t.to_cuda()” takes a long time (about 3 minutes) when module.apply() is called. My guess is that the CUDA dependencies have not been fully integrated into the .whl file. Could you please tell me the reason, or share the correct process for building a “manywheel” file?
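
To check whether the delay is a one-time startup cost or something that recurs on every call, the first and second calls can be timed separately. This is only a minimal sketch: time_first_and_second is a hypothetical helper, and the torch calls in the usage comment assume a machine with torch and a CUDA GPU.

```python
import time


def time_first_and_second(fn):
    """Call fn twice and return (first_seconds, second_seconds)."""
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return tuple(timings)


# Usage on the GPU machine (requires torch built with CUDA):
#   import torch
#
#   def move_to_gpu():
#       torch.zeros(1).cuda()      # first call also initializes CUDA
#       torch.cuda.synchronize()   # wait until the GPU work is done
#
#   first, second = time_first_and_second(move_to_gpu)
#   print(f"first: {first:.1f}s, second: {second:.3f}s")
```

If only the first call is slow, the cost is in CUDA initialization rather than in the tensor copy itself.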

@YuxiaoXu if you kept these particular env variables intact: then there’s no reason it would take 3 minutes of startup time on the 2nd machine.
The only situation I can think of where that would happen is this:

  • You build the wheel with CUDA 8
  • The 2nd machine has a Volta GPU

In this case, it will compile some CUDA kernels for Volta at runtime (because Volta needs at least CUDA 9 for direct support).
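
A quick way to check for this mismatch is to compare the CUDA version the wheel was built with against the GPU's compute capability. The sketch below is not a PyTorch API: the table is abbreviated and toolkit_targets_gpu is a hypothetical helper. It also ignores which architectures were actually selected at build time, so treat a False result as a hint rather than a diagnosis.

```python
# Minimum CUDA toolkit version that can generate native code for each
# compute capability. Abbreviated: newer architectures are omitted.
MIN_TOOLKIT = {
    (6, 0): (8, 0),   # Pascal
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere
    (8, 6): (11, 1),  # Ampere (GA10x)
}


def parse_version(text):
    """Turn a version string such as '9.0' or '9.0.176' into (9, 0)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def toolkit_targets_gpu(cuda_version, capability):
    """True if this toolkit version can emit native code for the GPU."""
    # Pick the newest table entry that is not above the capability.
    known = [cap for cap in MIN_TOOLKIT if cap <= capability]
    if not known:
        return True  # older than the table covers; assumed supported
    return parse_version(cuda_version) >= MIN_TOOLKIT[max(known)]


# Usage with torch (torch.version.cuda is None on CPU-only builds):
#   import torch
#   cap = torch.cuda.get_device_capability(0)  # e.g. (7, 0) on Volta
#   print(toolkit_targets_gpu(torch.version.cuda, cap))
```

For the case described above, toolkit_targets_gpu("8.0", (7, 0)) is False, while the same GPU with a CUDA 9 build gives True.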
