This is not a very complicated issue, but I am not sure what the best way is to load the weights onto the CPU when the model was trained on a GPU, so here is my solution:
model = torch.load('mymodel')       # load the saved model (by default tensors are restored to the device they were saved from)
self.model = model.cpu().double()   # move all parameters to the CPU and cast them to float64
I am not sure whether this should be considered a bug; this discussion is also related: link.
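For completeness, the same thing can be done by mapping the tensors to the CPU at load time instead of moving the model afterwards. A minimal sketch, reusing the 'mymodel' file from above:

import torch

# Map every tensor in the checkpoint to CPU memory while loading,
# so no CUDA device is needed at load time.
model = torch.load('mymodel', map_location='cpu')
model = model.double()  # optional: cast to float64, as in the snippet above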
Quick question… this only seems to work when the model is trained on one GPU. If I train my model on multiple GPUs, save it, and then try to load it on the CPU, I get this error: KeyError: 'unexpected key "module.conv1.weight" in state_dict'. Is there something different that needs to happen during saving/loading for multiple GPUs?
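The "module." prefix comes from wrapping the model in nn.DataParallel for multi-GPU training; one way around it is to strip that prefix before calling load_state_dict. A rough sketch, assuming the checkpoint is a state_dict; the checkpoint path and YourModel class are placeholders:

import torch

# Checkpoints saved from an nn.DataParallel-wrapped model prefix every
# parameter name with "module."; strip it before loading into a plain model.
state_dict = torch.load('multi_gpu_checkpoint.pth', map_location='cpu')  # placeholder path
state_dict = {k[len('module.'):] if k.startswith('module.') else k: v
              for k, v in state_dict.items()}

model = YourModel()  # placeholder: the same architecture the checkpoint was trained with
model.load_state_dict(state_dict)

Saving model.module.state_dict() instead of model.state_dict() when training with DataParallel avoids the prefix in the first place.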
storage and loc are supposed to be replaced by the variable I want to store the value in and the target location (CPU or GPU number) respectively, right? But how do I specify the location? Are there any ‘keywords’ to do so?
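For what it's worth, storage and loc are not placeholders you fill in; they are the arguments PyTorch passes to the map_location callable for each tensor in the checkpoint, and returning the storage unchanged keeps it on the CPU. The "keywords" for a target location are device strings. A minimal sketch ('checkpoint.pth' is a placeholder):

import torch

# Callable form: called as map_location(storage, location_tag) for every tensor;
# returning the storage unchanged deserializes it on the CPU.
state_dict = torch.load('checkpoint.pth', map_location=lambda storage, loc: storage)

# String / device forms work as well:
state_dict = torch.load('checkpoint.pth', map_location='cpu')
state_dict = torch.load('checkpoint.pth', map_location='cuda:0')             # first GPU
state_dict = torch.load('checkpoint.pth', map_location=torch.device('cpu'))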
I have a related question. I have a shared model trained on a GPU, and another process needs this model for inference on the CPU, so I use a shared model and the following command to load it
If I load the model with your hack and set model.train(True), even inference fails. So it does not work for all cases (fine-tuning on the CPU after training on a GPU does not work).
Not sure about your situation.
However, maybe try the following code, which works for me:
import torch

use_cuda = torch.cuda.is_available()
DEVICE = torch.device('cuda' if use_cuda else 'cpu')  # 'cpu' in this case
cpu_model = your_model()  # instantiate your model architecture
cpu_model.load_state_dict(torch.load(path_to_your_saved_gpu_model, map_location=DEVICE))
Sir, in this post I didn't understand what to put in place of your_model(). Can you please show it with an example?
Thank you, waiting for your response…
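your_model() just stands for constructing the same model class (architecture) that the checkpoint was trained with. A rough sketch with a made-up network; the class name, layer sizes, and file path are only for illustration:

import torch
import torch.nn as nn

# Hypothetical architecture -- replace with the model class you actually trained.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

cpu_model = SmallNet()  # this is the part that your_model() stands for
cpu_model.load_state_dict(torch.load('saved_gpu_model.pth', map_location=DEVICE))  # placeholder path
cpu_model.eval()        # switch to evaluation mode for inference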