Smaller torch saved model

I have a trained NTS-Net whose saved file takes 108 MB of storage. My server does not have enough space, although it is short by only a few MB. So I compress the “state_dict” with “tar.gz”, which brings it down to 100 MB. That is just enough. To load the model I use this function:

import pickle
import tarfile

from torch.serialization import _load, _open_zipfile_reader

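def torch_load_targz(file_path):
    # Open the gzip-compressed tar archive; the first member is the saved model.
    with tarfile.open(file_path, "r:gz") as tar:
        member = tar.getmembers()[0]
        with tar.extractfile(member) as untar:
            # _open_zipfile_reader and _load are private torch helpers,
            # so they may change between torch versions.
            with _open_zipfile_reader(untar) as zipfile:
                torch_loaded = _load(zipfile, None, pickle)
    return torch_loaded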

if __name__ == '__main__':
    # equivalent of torch.load("../models/") for a .tar.gz file

So in the end I read the torch model directly from the tar.gz. But this way the predictions are too slow.
Is there a better solution to this problem?

(I’m using torch 1.4.0 and Python 3.6.)

Is the loading of the state_dict slow, or the model predictions?
The latter shouldn’t be influenced by how the state_dict is loaded. Or are you reloading it in every iteration?
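
If the archive does get reloaded on every prediction call, one option is to cache the loaded object so the decompression happens only once per process. Here is a minimal sketch of that idea. Note the assumptions: `load_once` and `make_demo_archive` are hypothetical names, and `pickle.load` stands in for the torch loading call so the example runs without torch.

```python
import functools
import os
import pickle
import tarfile
import tempfile

load_calls = 0  # counts real decompressions, for illustration only


@functools.lru_cache(maxsize=None)
def load_once(file_path):
    """Decompress and unpickle the first member of a .tar.gz, once per path."""
    global load_calls
    load_calls += 1
    with tarfile.open(file_path, "r:gz") as tar:
        member = tar.getmembers()[0]
        with tar.extractfile(member) as untar:
            # Assumption: pickle.load replaces the torch loader in this sketch.
            return pickle.load(untar)


def make_demo_archive(directory):
    """Write a tiny pickled dict into a .tar.gz and return the archive path."""
    raw_path = os.path.join(directory, "weights.pkl")
    with open(raw_path, "wb") as f:
        pickle.dump({"layer.weight": [0.1, 0.2, 0.3]}, f)
    archive_path = os.path.join(directory, "weights.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(raw_path, arcname="weights.pkl")
    return archive_path


with tempfile.TemporaryDirectory() as tmp:
    path = make_demo_archive(tmp)
    first = load_once(path)   # decompresses the archive
    second = load_once(path)  # served from the cache, no decompression
```

With this pattern the cost of reading the tar.gz is paid at the first call only; later predictions reuse the object already in memory.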

The slow part is extracting the tar.gz, and the compression saves only 8 MB on the stored state_dict file.
I’m only talking about predictions (on small data), so there are no iterations inside a single call.

It is expected that unzipping a file will take longer than e.g. reading a binary file.
What kind of system are you using that you are running out of memory for 108MB?
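
To see the trade-off on your own machine, a rough measurement like the following might help. It is only a sketch under assumptions: random bytes stand in for the weights (real float weights compress a little, as the 8 MB saving reported above suggests), and `timed_read` is a hypothetical helper.

```python
import gzip
import os
import tempfile
import time


def timed_read(open_fn, path):
    """Read a whole file with the given opener and return (data, seconds)."""
    start = time.perf_counter()
    with open_fn(path, "rb") as f:
        data = f.read()
    return data, time.perf_counter() - start


# Assumption: 4 MB of random bytes as a stand-in for high-entropy weights.
payload = os.urandom(4 * 1024 * 1024)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "weights.bin")
    gz_path = raw_path + ".gz"
    with open(raw_path, "wb") as f:
        f.write(payload)
    with gzip.open(gz_path, "wb") as f:
        f.write(payload)

    raw_size = os.path.getsize(raw_path)
    gz_size = os.path.getsize(gz_path)
    raw_data, raw_seconds = timed_read(open, raw_path)
    gz_data, gz_seconds = timed_read(gzip.open, gz_path)

print("raw:  %d bytes, read in %.4f s" % (raw_size, raw_seconds))
print("gzip: %d bytes, read in %.4f s" % (gz_size, gz_seconds))
```

The gzip read has to decompress the data in addition to reading it, while high-entropy data barely shrinks, so the extra time buys very little space.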

No. My server’s space is filled by:

  • torch, torchvision, and other libraries
  • the 108 MB trained model

For example, I see that converting a TensorFlow model with TensorFlow Lite can reduce the model size in MB a lot. I was wondering whether something similar exists for PyTorch. If there is another way, maybe it would be less slow.

You could try to quantize your model to reduce the size.
However, I’m not sure how experimental this feature is at the moment.
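
As a rough illustration, dynamic quantization could look like the sketch below. The small `Sequential` model and the `state_dict_bytes` helper are hypothetical stand-ins, not the NTS-Net code. Also note that `quantize_dynamic` only converts the layer types you pass in (here `torch.nn.Linear`), so a network that is mostly convolutional would probably shrink much less this way and might need static quantization instead.

```python
import io

import torch


def state_dict_bytes(model):
    """Serialize the state_dict into memory and return its size in bytes."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes


# Hypothetical stand-in model; the trained network would be used here instead.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)
model.eval()

# Replace the Linear layers with versions that store int8 weights.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

original_size = state_dict_bytes(model)
quantized_size = state_dict_bytes(quantized)
print("float32 state_dict: %d bytes" % original_size)
print("int8 state_dict:    %d bytes" % quantized_size)
```

Quantization can change the predictions slightly, so it is worth comparing the accuracy of the quantized model against the original before deploying it.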

Thank you! I’ll try this. I’m also wondering about other options.