Optimizer not loading state dict to correct device

Hey,
I am saving a state dict and then loading it to resume training. I specify the map_location as cuda in torch.load, and it works for the model weights, but not for the optimizer. Could you help me?

checkpoint = torch.load(net_path, map_location=torch.device("cuda:0"))
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])  # optimizer state tensors end up on the CPU
model.load_state_dict(checkpoint['model_state_dict'])  # model weights are loaded onto cuda:0

Python 3.10
PyTorch 2.0.0

Did you move the parameters to cuda before instantiating the optimizer?
The optimizer's load_state_dict will cast and move the loaded state tensors to the dtype and device of the parameters the optimizer was constructed with, so the parameters need to be on the GPU before you create the optimizer and load its state.
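
For example, something along these lines keeps everything on the GPU (a minimal sketch with a placeholder model, optimizer, and checkpoint path; substitute your own):

import torch
import torch.nn as nn

device = torch.device("cuda:0")
net_path = "checkpoint.pth"  # placeholder path for this sketch

# Placeholder model; move it to the GPU *before* creating the optimizer,
# so the optimizer references CUDA parameters.
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

checkpoint = torch.load(net_path, map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
# load_state_dict casts the saved optimizer state to the dtype and device
# of the parameters the optimizer was built with, i.e. cuda:0 here.
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

# Sanity check: every tensor in the optimizer state should now be on cuda:0.
for state in optimizer.state.values():
    for v in state.values():
        if torch.is_tensor(v):
            assert v.device == device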

Best regards

Thomas


Hi Tom, thank you for your response!
You are right - I instantiated the optimizer before moving the model's parameters to cuda. I didn't know it worked that way. Thank you!