Does anyone know how to load a .pth pre-trained model from fastai into PyTorch?


You would need the model definition, create an instance, and use model.load_state_dict.
I'm not sure which model you would like to load or how it is defined, so feel free to share some more information. :wink:
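
In code, that general pattern looks roughly like this (a minimal sketch; the nn.Sequential and the 'model.pth' path are just placeholders for the real architecture and checkpoint):

import torch
import torch.nn as nn

# Placeholder definition -- replace with the architecture that was actually trained
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

state_dict = torch.load('model.pth', map_location='cpu')  # placeholder path
model.load_state_dict(state_dict)  # the keys/shapes must match the definition above
model.eval()                       # switch to evaluation mode for inference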

Thank you for replying. I was using ResNet-34 from fastai and exported a pre-trained model:

pth trained file

The notebook I trained in created the stage-2.pth file:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.save('stage-2', return_path=True)

I want to load this pre-trained .pth file for feature extraction in a GAN.
How do I write the model definition when the model came from fastai and I don't even know what it is?

I’m not deeply familiar with the cnn_learner class, but based on this line of code it should contain the model instance as an attribute, which might be used to save the state_dict directly.

Based on your code the actual model might just be torchvision.models.resnet34.
You could try to create an instance of this model and load the state_dict.
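
For example, a sketch of that suggestion (assumptions: the checkpoint is a plain state_dict whose keys match torchvision's resnet34, and num_classes matches the trained head; if fastai wrapped the backbone with its own head, the keys will differ and load_state_dict will complain):

import torch
from torchvision import models

model = models.resnet34(num_classes=4)  # placeholder: must match the number of trained classes
state_dict = torch.load('stage-2.pth', map_location='cpu')
model.load_state_dict(state_dict)       # fails loudly if the keys or shapes do not line up
model.eval()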

Hi, thank you for answering me. But your way does not work. It looks like once I call learn.model directly, it fails to predict the true value correctly, even though the original fastai model has high accuracy.

[pre-trained model](https://drive.google.com/drive/u/0/folders/1vGgwkDCDMb5L8z6SgE1LDor6PhBeGmuW)
data_set

In [304]:
print(np.argmax(model_pkl(intput.cuda()).cpu().detach().numpy()))
print(model_pkl(intput.cuda()).cpu().detach().numpy())
0
[[ 2.419829 -0.988511 -0.085516 -0.889875]]
In [0]:
learn_pth = learn.load('/content/drive/My Drive/fastai-v3/data/Trained Happy Sugar Life/stage-2').model;
learn_pth.eval()
In [303]:
print(np.argmax(learn_pth(intput.cuda()).cpu().detach().numpy()))
print(learn_pth(intput.cuda()).cpu().detach().numpy())
0
[[ 2.419829 -0.988511 -0.085516 -0.889875]]
In [301]:
pred_class, pred_idx, outputs = learn_pkl.predict(img_it)
print(pred_class)
print(pred_idx)
print(outputs)
shouko
tensor(3)
tensor([2.0889e-05, 3.8629e-11, 2.9075e-07, 9.9998e-01])
In [305]:
pred_class, pred_idx, outputs = learn.predict(img_it)
print(pred_class)
print(pred_idx)
print(outputs)
shouko
tensor(3)
tensor([2.0889e-05, 3.8629e-11, 2.9075e-07, 9.9998e-01])

The last two tests compare loading the .pth and the .pkl and their output values. Those two outputs are the same, but they differ completely from the outputs produced by calling the model instance attribute directly, and that direct call makes a wrong prediction (the correct index is 3). I don't understand it, since the ResNet that fastai uses is exactly the same as the official PyTorch one.

Link to notebook

I would recommend checking what learn.predict is doing and trying to match that behavior in your "manual approach", i.e. is it normalizing the inputs or processing them in some other way?
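
For instance, fastai's vision pipeline usually resizes the image and normalizes it with ImageNet statistics before the forward pass, and learn.predict also appears to return probabilities (the outputs above sum to 1). A sketch of reproducing that manually (the image path, input size, and normalization stats are assumptions about this particular setup and should be checked against the transforms used in training):

import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: resize + ImageNet normalization (verify against your databunch)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open('example.jpg').convert('RGB')  # hypothetical image path
intput = preprocess(img).unsqueeze(0)           # add a batch dimension, as in the notebook above
with torch.no_grad():
    out = learn_pth(intput.cuda())
print(out.softmax(dim=1))                       # probabilities, comparable to learn.predict's outputs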

Also, did you compare the parameters and make sure they are equal after loading the model?
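
A quick way to do that comparison, e.g. between learn.model and the manually loaded learn_pth (a sketch; both names refer to models that already exist in the notebook above):

import torch

# Compare layer names and weights of two models
sd_a = learn.model.state_dict()
sd_b = learn_pth.state_dict()

print(sd_a.keys() == sd_b.keys())                                    # same layer/key names?
print(all(torch.equal(sd_a[k].cpu(), sd_b[k].cpu()) for k in sd_a))  # identical weights?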

Thank you for replying. I don't think the parameters are changed after the learner loads the .pth file, because its accuracy is still high after loading.

I just hope you are right that learn.model holds the learned weights and is the model itself. So how do I check learn.model's parameters and state_dict keys and compare them with the learn object's own parameters and state_dict keys? Please take your time if you are busy.

Are you loading a state_dict object or a custom checkpoint format from fastai?
I’m not deeply familiar with the source code of fastai, so @jphoward might give some insight on how to get the underlying model.

I won't bother him because I think he is busy.

In addition, I think you have helped me enough already.
Thank you very much.

I will try to re-implement the learn.predict function, because I guess there is some magic in it.

You can see the source code for save here:

(@ptrblck you linked to v0.7 - you can see the ‘old’ in the path to it; v1 is in ‘fastai’, not ‘old’)

As you can see from the source, if with_opt is false, then the saved file is just the model state_dict; if it’s true, it’s a dictionary with the state_dict and the optimizer state.
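
Loading the file manually could then look roughly like this (a sketch, assuming fastai v1's format where the with_opt=True checkpoint is a dict holding the model weights under a 'model' key; check the linked source for the exact keys):

import torch

state = torch.load('stage-2.pth', map_location='cpu')

# with_opt=True saves a dict with the model and optimizer states;
# with_opt=False saves the model state_dict directly (see the source above)
model_state = state['model'] if isinstance(state, dict) and 'model' in state else state

learn.model.load_state_dict(model_state)
learn.model.eval()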


Hi Jeremy. Is the reason learn.model(input_img) cannot predict as well as learn.predict that I didn't load the opt state dict? Or, after I set eval() and requires_grad to False just like in the lesson 7 notebook, will learn.model(input_img) predict as well as learn.predict? Thank you.

Thank you, this made it work for me:


get_model_structure() gives you an untrained model


Train it

model = get_model_structure()
learn = Learner(data, model)
learn.fit_one_cycle(64)
learn.save('name', with_opt=False)


Load it for evaluation

model = get_model_structure()
state = torch.load('path to saved/name.pth')
model.load_state_dict(state)
model.eval()
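
After that, the model can be used for inference directly; just make sure to apply the same preprocessing that learn.predict would apply (a sketch; the random tensor stands in for a properly preprocessed image batch):

with torch.no_grad():
    input_batch = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed [N, C, H, W] batch
    logits = model(input_batch)
    pred = logits.argmax(dim=1)
print(pred)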
