I am witnessing strange and, in my opinion, wrong behavior. Can somebody please explain why this is happening?
I loaded my pretrained model via
```python
checkpoint = torch.load(params.pretrained_model, map_location=torch.device('cuda:0'))
model.load_state_dict(checkpoint['model_state_dict'])
```
And while all tensors in `checkpoint['model_state_dict']` have device type `cuda:0`, all parameters in the model have device type `cpu`. Why is it that when I load a tensor into a parameter, its value is loaded but its device is not? Does that mean an extra copy of each parameter is created on a different device?
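For what it's worth, here is a minimal CPU-only sketch of what I suspect is going on: `load_state_dict` appears to copy values in-place into the model's existing parameter tensors (the storage pointer does not change), which would explain why the parameters keep their original device regardless of the device of the tensors in the state dict. The `nn.Linear(2, 2)` layer here is just a toy stand-in for my actual model.

```python
import torch
import torch.nn as nn

# Toy model standing in for the real pretrained model.
model = nn.Linear(2, 2)
ptr_before = model.weight.data_ptr()  # storage address of the existing parameter

# Simulated checkpoint state dict (CPU tensors, since this sketch avoids CUDA).
state = {'weight': torch.ones(2, 2), 'bias': torch.zeros(2)}
model.load_state_dict(state)

ptr_after = model.weight.data_ptr()
print(ptr_before == ptr_after)  # same storage: values were copied in-place
print(model.weight.device)      # device of the model's parameter, not of the state dict
```

If this is right, then after loading a checkpoint I would still need an explicit `model.to('cuda:0')` to move the parameters to the GPU.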