Do I have to use the same GPU for saved state dict?


So I trained my neural network on, say, GPU #3 and saved the state dict. Do I have to use GPU #3 every time I want to load the state dict?

Also, will it work if I don’t move my model to CUDA before loading the state dict into it?

Thank you.

You can use the map_location argument in torch.load to map your GPU tensors to the CPU, as explained in the note section.
Alternatively, you could of course push the model to the CPU before saving its state_dict.
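A minimal sketch of the map_location approach (the filename "net.pt" and the Linear model are just placeholders for illustration):

```python
import torch
import torch.nn as nn

# Placeholder model and checkpoint path.
model = nn.Linear(10, 2)
torch.save(model.state_dict(), "net.pt")

# map_location="cpu" remaps all tensors to the CPU at load time,
# regardless of which GPU they were saved from.
state_dict = torch.load("net.pt", map_location="cpu")
model.load_state_dict(state_dict)

# The model can then be moved to whichever GPU is available, e.g.:
# model.to("cuda:0")
```

map_location also accepts a device string such as "cuda:0" or a dict like {"cuda:3": "cuda:0"} to remap tensors between GPUs directly.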

Say I want to save the state_dict every epoch: do I have to push the model back to the GPU after saving its state_dict on the CPU?

Thank you

Yes, the model would have to be copied back to the GPU.
However, depending on the length of your epoch and how often you save the model (e.g. only the best model based on the validation accuracy), the overhead might be negligible.
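The round trip described above might look like this; the training loop body and checkpoint filenames are hypothetical stand-ins:

```python
import torch
import torch.nn as nn

# Use a GPU if one is available, otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = nn.Linear(10, 2).to(device)

for epoch in range(2):
    # ... training steps would go here ...

    # Move the model to the CPU so the saved state_dict holds CPU tensors.
    model.cpu()
    torch.save(model.state_dict(), f"epoch_{epoch}.pt")

    # Copy the parameters back to the device to continue training.
    model.to(device)
```

If even that overhead matters, an alternative sketch is to copy the tensors without moving the model at all: torch.save({k: v.cpu() for k, v in model.state_dict().items()}, path).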