Why does a `.pth` file take up so much space? (I am missing something)


>>> net = nn.Conv2d(3, 3, 1, 1, 0)


>>> sys.getsizeof(net.weight.data)

which returns a size in bytes.


>>> torch.save(net.state_dict(), './net.pth')
>>> !du -h ./net.pth
4.0K    ./net.pth

which, converted to bytes (du reports in multiples of 1024),

>>> 4 * 1024

is 4096 bytes!!!

Please tell me what I am missing, and why there is such a massive increase in size when it is stored in a file.

P.S. It's quite obvious I am missing something basic; just point me to an article, or to the place in the PyTorch codebase where this is implemented.

Okay, my second line is wrong.

>>> sys.getsizeof(net.weight.data)

getsizeof can’t handle torch.Tensor:
it returns 72 for empty tensors as well!
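To make the gap concrete, here is a small sketch comparing what `sys.getsizeof` sees (just the Python wrapper object, a small and roughly constant number of bytes, varying a bit by Python/PyTorch version) with the size of the underlying buffer, computed from `nelement()` and `element_size()`:

```python
import sys
import torch

t = torch.zeros(1000)  # 1000 float32 elements -> 4000 bytes of raw data

# getsizeof only measures the Python wrapper around the tensor,
# not the C-level buffer it points to, so this stays tiny
# regardless of the tensor's shape.
print(sys.getsizeof(t))

# The actual buffer size: number of elements times bytes per element.
print(t.nelement() * t.element_size())
```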

So, I did this

>>> t1 = torch.Tensor([5.])
>>> sys.getsizeof(t1.item())
24

Okay, looks good. `.item()` returns a plain Python float, which takes 24 bytes on 64-bit CPython.
But in that case the model’s weights would only add up to

>>> 24 * torch.numel(net.weight.data)
216

That is closer to 4 KB than before, but still far away!
My guess is that each parameter uses far more bytes than what shows up here.
Most likely I am doing something wrong.


A few things I think:

  • du -h actually counts the size of the containing folder as well. Running “ls -lha” will show you that the folder entry takes 4K, while the .pth file itself is only 515 bytes.
  • sys.getsizeof() measures the size of the Python object only, so it is very unreliable for most PyTorch objects.
  • The storage format is not really optimized for space. You can actually “cat” the file and you’ll see that it contains more strings than actual data.
  • Using a different pickle backend might improve that, but I have never tried (the pickle_module argument to the save function).
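Putting the first two bullets together, here is a sketch (file name arbitrary) comparing the raw parameter bytes of the tiny conv layer from the question against the size of the saved file, which is larger because of pickle’s bookkeeping strings and headers:

```python
import os
import torch
import torch.nn as nn

net = nn.Conv2d(3, 3, 1, 1, 0)
torch.save(net.state_dict(), 'net.pth')

# Raw tensor data: 3*3*1*1 = 9 weights plus 3 biases, 4 bytes each (float32).
raw_bytes = sum(p.nelement() * p.element_size() for p in net.parameters())
print('raw parameter bytes:', raw_bytes)

# The file on disk also carries pickle headers and key strings
# ('weight', 'bias', ...), so it is noticeably larger.
print('file size on disk:', os.path.getsize('net.pth'))
```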

Okay, I didn’t know that about du -h. Now that you say so, I checked through ls -lha and the GUI, and it does show 5xx bytes, which is good :smile: Thanks!
Maybe I will try changing the pickle_module once and play around with it a bit.

My main aim was to calculate the size of a model, and from that the amount of memory it will take when I move it to the GPU.
It should take around the same memory, right? (i.e. the size computed from the model’s datatypes should match the memory it consumes on the GPU once I move it there? :thinking:) Intuitively it should.
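Roughly yes: the parameters take about the number of bytes implied by their dtypes, although PyTorch’s caching allocator rounds each allocation up to a block boundary, so `torch.cuda.memory_allocated` usually reports slightly more. A sketch (the CUDA part only runs if a GPU is present):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 1, 1, 0)

# Expected footprint of the parameters, from dtype sizes alone.
expected = sum(p.nelement() * p.element_size() for p in model.parameters())
print('expected parameter bytes:', expected)

if torch.cuda.is_available():
    before = torch.cuda.memory_allocated()
    model.cuda()
    after = torch.cuda.memory_allocated()
    # Usually a bit above `expected`: the caching allocator rounds
    # each tensor's allocation up to a block boundary.
    print('bytes allocated on GPU:', after - before)
```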

Please feel free to reply whenever you are free; I am in no hurry.
Thanks a ton for getting back to me :slight_smile: Really appreciated :smile:


The size it will take on the GPU is the size of all its tensors. So the following will give you the memory in bytes:

size = 0
for p in model.parameters():
    size += p.nelement() * p.element_size()
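One caveat worth adding: `model.parameters()` misses registered buffers, such as BatchNorm’s running statistics, which occupy memory too. A sketch of a helper (the name `model_bytes` is my own) that counts both:

```python
import torch.nn as nn

def model_bytes(model):
    """Bytes used by a model's parameters plus its buffers
    (e.g. BatchNorm running_mean / running_var)."""
    size = sum(p.nelement() * p.element_size() for p in model.parameters())
    size += sum(b.nelement() * b.element_size() for b in model.buffers())
    return size

# BatchNorm2d has weight and bias (parameters) plus running_mean,
# running_var and num_batches_tracked (buffers).
print(model_bytes(nn.BatchNorm2d(3)))
```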