Why are saved models so large?

sen1 · November 18, 2022, 1:37pm

I have implemented a neural network with an LSTM model (see below). After training the model with a hidden size of 512, I saved it by calling torch.save(model).

I expected the model size to be in the low tens of kilobytes, accounting for the hidden parameters of three LSTM layers. However, when I enumerated the parameters via model.named_parameters(), I was surprised to find that their actual unzipped size is more like 30 MB.

Why is the saved model so large, and what can I do to cut down its size?

class IRNN(nn.Module):
    def __init__(self, obs_size, hidden_size):
        super(IRNN, self).__init__()

        self.num_layers = 3
        input_size = obs_size
        self.hidden_size = hidden_size
        self.hidden = self.init_hidden()
        self.hiddenc = self.init_hidden()
        self.rnn = nn.LSTM(
            input_size=input_size, hidden_size=hidden_size, num_layers=self.num_layers
        )
        self.i2o = nn.Linear(hidden_size, 256)

ptrblck · November 18, 2022, 7:14pm

That’s in line with what I’d expect, as I’m seeing a parameter size of ~24.5MB using:

import torch
import torch.nn as nn


class IRNN(nn.Module):
    def __init__(self, obs_size, hidden_size):
        super(IRNN, self).__init__()

        self.num_layers = 3
        input_size = obs_size
        self.hidden_size = hidden_size
        self.rnn = nn.LSTM(
            input_size=input_size, hidden_size=hidden_size, num_layers=self.num_layers
        )
        self.i2o = nn.Linear(hidden_size, 256)


model = IRNN(512, 512)
torch.save(model.state_dict(), "tmp.pt")

# Sum the bytes of every parameter tensor (4 bytes per float32 element)
size = sum(p.nelement() * p.element_size() for p in model.parameters())
print("{} parameters".format(sum(p.nelement() for p in model.parameters())))
# 6435072 parameters
print("size: {}MB".format(size / 1024**2))
# size: 24.5478515625MB

Note that I had to remove the self.hidden and self.hiddenc attributes, as self.init_hidden is undefined. If anything, this lowers my size estimate, so your saved model should be somewhat larger.

On my system the state_dict is ~26MB when saved to disk, which matches the estimate: 6,435,072 float32 values take 25,740,288 bytes, plus a small amount of serialization overhead.
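
To see where the bytes go, here is a minimal sketch that prints a per-tensor breakdown, checks the total against a closed-form count, and compares the serialized size of a float32 and a float16 state_dict. The helpers lstm_param_count and serialized_bytes are names I made up for illustration, and the float16 copy is only one possible way to shrink the file, assuming your use case tolerates the lost precision.

```python
import io

import torch
import torch.nn as nn


class IRNN(nn.Module):
    def __init__(self, obs_size, hidden_size):
        super().__init__()
        self.rnn = nn.LSTM(input_size=obs_size, hidden_size=hidden_size, num_layers=3)
        self.i2o = nn.Linear(hidden_size, 256)


def lstm_param_count(input_size, hidden_size, num_layers):
    """Hypothetical helper: parameter count of a unidirectional nn.LSTM with biases."""
    total = 0
    for layer in range(num_layers):
        in_features = input_size if layer == 0 else hidden_size
        # 4 gates, each with input weights and recurrent weights,
        # plus two bias vectors of length 4 * hidden_size
        total += 4 * hidden_size * (in_features + hidden_size) + 8 * hidden_size
    return total


def serialized_bytes(state_dict):
    """Hypothetical helper: size of torch.save output, measured in memory."""
    buffer = io.BytesIO()
    torch.save(state_dict, buffer)
    return buffer.getbuffer().nbytes


model = IRNN(512, 512)

# Per-tensor breakdown: the (4 * hidden_size, input_size) weight matrices dominate
for name, p in model.named_parameters():
    print(name, tuple(p.shape), p.numel() * p.element_size(), "bytes")

# Closed-form total: LSTM layers plus the linear head (weight + bias)
expected = lstm_param_count(512, 512, 3) + (512 * 256 + 256)
actual = sum(p.numel() for p in model.parameters())
print(expected, actual)

# Serialized size in float32 versus a float16 copy of the same tensors
full = serialized_bytes(model.state_dict())
half = serialized_bytes({k: v.half() for k, v in model.state_dict().items()})
print(full, half)
```

The weight matrices grow with hidden_size squared, so a hidden size of 512 lands in the tens of megabytes rather than kilobytes. Saving the state_dict instead of the whole module avoids pickling extra attributes, and the float16 copy should come out at roughly half the size.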