I asked this before and did not get an answer, so I am asking again in case someone has an idea about it.
I am working on a project that prunes neurons, i.e. the architecture of the network is constantly changing as neurons are removed from layers. I need to save models during training that have different architectures. As far as I know, I have two options:
First option:
torch.save(net.state_dict(), PATH)
model = Net()
model.load_state_dict(torch.load(PATH))
Second option:
torch.save(net, PATH)
I am currently using the second option; however, it is somehow saving a lot of extra data. Networks like AlexNet and ResNet should take no more than roughly 100 MB on disk, yet my saved models take 10-15 GB. In other words, it is saving some other tensors, and I don't know what they are.
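For reference, a quick way to check whether the parameters alone account for the file size is to serialize both ways into in-memory buffers and compare byte counts (a toy sketch with a small stand-in model, not my actual network):

```python
import io

import torch
import torch.nn as nn

# Small stand-in model; the real network would be AlexNet/ResNet.
net = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))

# Serialize the state_dict and the whole module into memory buffers.
buf_sd, buf_full = io.BytesIO(), io.BytesIO()
torch.save(net.state_dict(), buf_sd)   # option 1: parameters only
torch.save(net, buf_full)              # option 2: pickle the whole module

# tell() gives the number of bytes written to each buffer.
print("state_dict bytes:", buf_sd.tell())
print("whole model bytes:", buf_full.tell())
```

If the whole-model file is orders of magnitude larger than the state_dict, something beyond the parameters is being pickled along with the module.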
I would like to use the first option, but since I am pruning the network and the architecture changes constantly, I will not be able to load the weights back unless I know the architecture and also have code that rebuilds it for me, which is not convenient.
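To illustrate the kind of bookkeeping the first option would require, here is a sketch that stores an architecture description alongside the state_dict. It assumes, purely hypothetically, that the pruned architecture can be described by a list of layer widths; my real pruning is not that simple, which is exactly the inconvenience:

```python
import io

import torch
import torch.nn as nn


def build_net(widths):
    # Rebuild an MLP from a list of layer widths -- a stand-in for
    # whatever description the real pruned architecture would need.
    layers = []
    for in_f, out_f in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(in_f, out_f), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the trailing ReLU


widths = [784, 120, 10]
net = build_net(widths)

# Save the architecture description next to the weights.
buf = io.BytesIO()
torch.save({"widths": widths, "state_dict": net.state_dict()}, buf)

# Later: rebuild the architecture first, then load the weights into it.
buf.seek(0)
ckpt = torch.load(buf)
restored = build_net(ckpt["widths"])
restored.load_state_dict(ckpt["state_dict"])
```

The catch is that `build_net` must be able to reproduce every architecture the pruning produces, which is the code I was hoping to avoid writing.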
Now my question is: is there a way to use the second option while also saving the model architecture, so that I can load the models back? I would appreciate any help. Thanks.