[SOLVED] Loading cuda()'ed state_dict in CUDA_VISIBLE_DEVICES=-1 raises RuntimeError

Is this intended behavior? Loading seems to work in only one direction: a cuda()'ed model's load_state_dict() can load a state_dict saved from a cpu()'ed model, but the reverse fails when CUDA is not available (CUDA_VISIBLE_DEVICES=-1). I read the torch.load() docs, and I understand that a saved state_dict may contain CUDA-device-related information. So I think it would be good for nn.Module.state_dict() to have an option to convert all output tensors to CPU tensors when saving.
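
To make the suggestion concrete, here is a minimal sketch of what such a conversion could look like in user code. The helper name `cpu_state_dict` is hypothetical and not part of PyTorch, and an in-memory buffer stands in for a file on disk so the snippet is self-contained.

```python
import io

import torch
import torch.nn as nn


def cpu_state_dict(module: nn.Module) -> dict:
    """Hypothetical helper: copy of module.state_dict() with every tensor on the CPU."""
    return {name: tensor.cpu() for name, tensor in module.state_dict().items()}


model = nn.Linear(4, 2)
if torch.cuda.is_available():
    model = model.cuda()  # the parameters now live on the GPU

# Save only CPU tensors, so the file carries no CUDA device information.
buffer = io.BytesIO()  # stands in for a file on disk
torch.save(cpu_state_dict(model), buffer)

# The saved tensors are CPU tensors, so loading needs no map_location.
buffer.seek(0)
restored = torch.load(buffer)
model.load_state_dict(restored)  # load_state_dict copies onto the model's device
```

This only covers the model; an optimizer's state_dict is nested, so the same one-line comprehension would not be enough for it.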

EDIT:
I tried the cpu()-save()-cuda() approach to work around this, but I just found out that the optimizer's interface doesn't have cpu() or cuda(). How can I resolve this?

My problem is that I just want to check whether my model is learning correctly, but I'm getting an out-of-memory error. Setting CUDA_VISIBLE_DEVICES=-1 avoids the OOM error, but raises another error instead.

If you want to load a model that was trained on CUDA and run it on the CPU, just load it like this:

saved_state = torch.load("saved_model_dict_file", map_location=lambda storage, loc: storage)
model.load_state_dict(saved_state)
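
The same map_location remapping applies to every tensor stored in the file, so it should also cover the optimizer question from the EDIT above. Below is a rough sketch, assuming the model and optimizer state were saved together in one dict; the keys "model" and "optimizer" and the in-memory buffer are illustrative choices, not something from this thread. It uses map_location="cpu", which torch.load also accepts in place of the lambda. The model is built on the CPU here so the snippet runs anywhere; with a checkpoint saved from a GPU run, the torch.load call would be the same.

```python
import io

import torch
import torch.nn as nn

# Train for one step so the optimizer actually holds state (momentum buffers).
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
model(torch.randn(3, 4)).sum().backward()
optimizer.step()

# Save both state dicts in one checkpoint (the key names are illustrative).
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}
buffer = io.BytesIO()  # stands in for a file on disk
torch.save(checkpoint, buffer)

# map_location remaps every stored tensor to the CPU, including the
# tensors nested inside the optimizer's state.
buffer.seek(0)
loaded = torch.load(buffer, map_location="cpu")

cpu_model = nn.Linear(4, 2)
cpu_optimizer = torch.optim.SGD(cpu_model.parameters(), lr=0.1, momentum=0.9)
cpu_model.load_state_dict(loaded["model"])
cpu_optimizer.load_state_dict(loaded["optimizer"])
```

Because the device is handled at load time, neither the model nor the optimizer needs a cpu() call before saving.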

It works. Thanks very much!

Glad to hear it. :grin: