Reducing Docker image size with PyTorch models

I’d like to deploy four of my models, with a total size of ~100 MB when their state is saved on disk. It would be great if the Docker image took as little space as possible, no more than 700 MB.

Right now I’m building the Docker image and installing a few dependencies, and suddenly it takes 2.8 GB on disk, of which PyTorch and related libraries take at least 800 MB in conda.

Therefore I’m looking for a simple way to deploy these models. As far as I understand, I could use jit and be able to run the models with the small libtorch library. However, I’d like to run the models in Python, not C++. Is that possible?

And the second thing I don’t understand: should I use ONNX or jit? What is the difference?

I’m looking for a simple guide with the steps to do that: export with jit and load it in Python with some library I don’t know of? Or export to ONNX and then to TensorFlow? Or maybe get rid of conda and just install some PyTorch build that takes no more than ~80 MB on disk?

Any help appreciated!
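To address the jit-in-Python part of the question: yes, a TorchScript model can be exported and loaded back entirely in Python, no C++ or separate libtorch build needed. A minimal sketch (the `TinyModel` module is a hypothetical stand-in for your own model):

```python
import torch

# Hypothetical stand-in model; substitute your own nn.Module
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyModel().eval()

# Export: compile the module to TorchScript and save it
scripted = torch.jit.script(model)  # or torch.jit.trace(model, example_input)
scripted.save("model.pt")

# Load and run -- plain Python, the saved file carries the model code
loaded = torch.jit.load("model.pt")
with torch.no_grad():
    out = loaded(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])
```

Note this still requires the `torch` package at inference time, so it helps with portability (no Python class definitions needed to load the model) but not by itself with image size.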

If you are deploying for CPU inference, rather than GPU-based inference, then you can save a lot of space by installing PyTorch with CPU-only capabilities. That significantly reduces the Docker image size (the PyTorch component is ~128 MB compressed).

To install CPU-only, open the Install selector, set the CUDA option to None, and you will get the right set of commands.
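As a sketch, a slim Dockerfile using the CPU-only wheels might look like the following (the version pins match the torch 1.5.0 era discussed below; `serve.py` and `model.pt` are hypothetical placeholders for your own entrypoint and exported model):

```dockerfile
FROM python:3.7-slim

# CPU-only wheels live on the PyTorch wheel index, not the default PyPI
RUN pip install --no-cache-dir torch==1.5.0+cpu torchvision==0.6.0+cpu \
    -f https://download.pytorch.org/whl/torch_stable.html

COPY model.pt /app/model.pt
COPY serve.py /app/serve.py
CMD ["python", "/app/serve.py"]
```

`--no-cache-dir` matters in Docker builds: without it, pip keeps a copy of the downloaded wheels inside the image layer.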

@smth thank you for the help!
Unfortunately, starting from an empty virtualenv of 4.8 MB, after the command:

pip install

or the command:

pip install torch

the venv grows to 375 MB.

I’m using Python 3.7 and macOS Catalina 10.15.4

My site-packages directory in the venv looks like this:

16M caffe2
4,0K easy-install.pth
3,0M future
56K future-0.18.2.dist-info
292K libfuturize
236K libpasteurize
86M numpy
132K numpy-1.18.3.dist-info
116K past
7,3M pip-19.0.3-py3.7.egg
560K setuptools-40.8.0-py3.7.egg
4,0K setuptools.pth
256M torch
352K torch-1.5.0.dist-info

Correct, that is uncompressed. When you compress it, it should be about half of that.

I’m not familiar with any way of using it as a compressed package. I mean, when I want to run inference, I need to uncompress it anyway, right?

I assume the best way is to convert the model and use it with libtorch, or export it to ONNX and try there.