I have trained a straightforward network and saved its weights. Then I wanted to train an autoencoder using the weights of the previously trained model, training only the decoder part. I have specified that I do not want gradients in the encoder part so that its parameters are not optimized. I have also specified in the optimizer that I only want to optimize the parameters of the decoder. I do not get any error, but it is not working well, because when I compare the two weight dictionaries of the encoder and the pretrained model, they are not the same. I have attached the models I am using: the pretrained model is “Class1”, and the autoencoder is “Class2” (the encoder part is self.layers). These two classes are in a file named models.py.
My code is a bit long and I am using 70000 images as the dataset, but the parts of the code in which I save and load the model are what I attached above. For example, if I print “model.layers.state_dict()[‘4.bias’]” in FFN() I get “tensor([-0.0336, -0.0421])”. But if I print the same in the Auto_encod_trained I get “tensor([0.0705, 0.0294])”, and they should be the same. I basically want to know if there is something more to take into account in order to fix the weights of the encoder part during training to the weights of the pretrained model (in my case FFN()). Thanks!!
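For what it's worth, here is a minimal sketch of the pattern being described: copy the pretrained weights into the encoder, then freeze them. The `nn.Sequential` stand-ins below are placeholders for the real Class1 / Class2.layers in models.py, and the layer sizes are made up:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the pretrained FFN (Class1) and the
# autoencoder's encoder (Class2's self.layers) -- the real ones are in models.py.
pretrained = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
encoder = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# 1) Copy the pretrained weights into the encoder.
encoder.load_state_dict(pretrained.state_dict())

# 2) Freeze the encoder so no gradients are computed for it.
for p in encoder.parameters():
    p.requires_grad = False

# 3) Sanity check: every tensor in the two state_dicts should now match.
for key in pretrained.state_dict():
    assert torch.equal(pretrained.state_dict()[key], encoder.state_dict()[key])
```

If the assertion in step 3 fails immediately after loading, the problem is in how the weights are saved/loaded rather than in the freezing.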
Okay, thanks!! I think the problem is how I read the state_dict() of model a. Because the two models are in different scripts, I have saved “model a” using torch.save() like this:
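Not the poster's actual code, but for reference, the usual two-script pattern is to save only the state_dict and rebuild the architecture before loading (the filename and variable names here are placeholders):

```python
import torch
import torch.nn as nn

# Script 1: train, then save only the weights (not the whole module).
model_a = nn.Linear(4, 2)  # placeholder for the pretrained FFN
torch.save(model_a.state_dict(), "ffn_weights.pt")

# Script 2: rebuild the same architecture, then load the saved weights.
model_b = nn.Linear(4, 2)
model_b.load_state_dict(torch.load("ffn_weights.pt"))
```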
No, I don’t think it’s the format, and PyTorch should raise an error if keys are missing or mismatched while loading the state_dict. Let me know if you have a code snippet that reproduces the issue.
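To illustrate that last point: with the default `strict=True`, `load_state_dict` raises a `RuntimeError` rather than silently skipping mismatched keys (placeholder modules chosen so their key names differ):

```python
import torch.nn as nn

src = nn.Linear(4, 2)                  # keys: "weight", "bias"
dst = nn.Sequential(nn.Linear(4, 2))   # keys: "0.weight", "0.bias"

try:
    dst.load_state_dict(src.state_dict())  # strict=True by default
    print("loaded without error")
except RuntimeError as e:
    print("load_state_dict raised RuntimeError")
```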