I asked this before and did not get an answer, so I am asking again in case someone has an idea about it.
I am working on a project that prunes neurons, i.e. the architecture of the network is constantly changing as neurons are removed from layers. I need to save models during training that have different architectures. As far as I know, I have two options:
First option:
torch.save(net.state_dict(), PATH)
model = Net()
model.load_state_dict(torch.load(PATH))
Second option:
torch.save(net, PATH)
I am currently using the second option; however, it is somehow saving a lot of extra data. Networks like AlexNet and ResNet should take no more than roughly 100 MB on disk, yet my saved models take 10-15 GB. In other words, it is saving some other tensors, and I don't know what they are.
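For reference, a quick way to check whether the parameters alone account for the file size is to serialize both ways into in-memory buffers and compare byte counts (a toy sketch with a small stand-in model, not my actual network):

```python
import io

import torch
import torch.nn as nn

# Small stand-in model; the real network would be AlexNet/ResNet.
net = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))

# Serialize the state_dict and the whole module into memory buffers.
buf_sd, buf_full = io.BytesIO(), io.BytesIO()
torch.save(net.state_dict(), buf_sd)   # option 1: parameters only
torch.save(net, buf_full)              # option 2: pickle the whole module

# tell() gives the number of bytes written to each buffer.
print("state_dict bytes:", buf_sd.tell())
print("whole model bytes:", buf_full.tell())
```

If the whole-model file is orders of magnitude larger than the state_dict, something beyond the parameters is being pickled along with the module.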
I would like to use the first option, but since I am pruning the network and the architecture changes constantly, I will not be able to load the weights back unless I know the architecture and also have code that rebuilds it for me, which is not convenient.
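To illustrate the kind of bookkeeping the first option would require, here is a sketch that stores an architecture description alongside the state_dict. It assumes, purely hypothetically, that the pruned architecture can be described by a list of layer widths; my real pruning is not that simple, which is exactly the inconvenience:

```python
import io

import torch
import torch.nn as nn


def build_net(widths):
    # Rebuild an MLP from a list of layer widths -- a stand-in for
    # whatever description the real pruned architecture would need.
    layers = []
    for in_f, out_f in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(in_f, out_f), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the trailing ReLU


widths = [784, 120, 10]
net = build_net(widths)

# Save the architecture description next to the weights.
buf = io.BytesIO()
torch.save({"widths": widths, "state_dict": net.state_dict()}, buf)

# Later: rebuild the architecture first, then load the weights into it.
buf.seek(0)
ckpt = torch.load(buf)
restored = build_net(ckpt["widths"])
restored.load_state_dict(ckpt["state_dict"])
```

The catch is that `build_net` must be able to reproduce every architecture the pruning produces, which is the code I was hoping to avoid writing.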
Now my question is: is there a way to use the second option while also saving the model architecture, so that I can load the models back? I would appreciate any help. Thanks.