I am witnessing strange and, in my opinion, wrong behavior. Can somebody please explain why this is happening?
I loaded my pretrained model via
```python
checkpoint = torch.load(params.pretrained_model, map_location=torch.device('cuda:0'))
model.load_state_dict(checkpoint['model_state_dict'])
```
And while all tensors in `checkpoint['model_state_dict']` have device type `cuda:0`, all parameters in the model have device type `cpu`. Why is it that when I load a tensor into a parameter, its value is loaded but its device is not? Does that mean an extra copy of each parameter is created on a different device?
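For what it's worth, here is a minimal CPU-only sketch of what I suspect is going on: `load_state_dict` appears to copy values in-place into the model's existing parameter tensors (the storage pointer does not change), which would explain why the parameters keep their original device regardless of the device of the tensors in the state dict. The `nn.Linear(2, 2)` layer here is just a toy stand-in for my actual model.

```python
import torch
import torch.nn as nn

# Toy model standing in for the real pretrained model.
model = nn.Linear(2, 2)
ptr_before = model.weight.data_ptr()  # storage address of the existing parameter

# Simulated checkpoint state dict (CPU tensors, since this sketch avoids CUDA).
state = {'weight': torch.ones(2, 2), 'bias': torch.zeros(2)}
model.load_state_dict(state)

ptr_after = model.weight.data_ptr()
print(ptr_before == ptr_after)  # same storage: values were copied in-place
print(model.weight.device)      # device of the model's parameter, not of the state dict
```

If this is right, then after loading a checkpoint I would still need an explicit `model.to('cuda:0')` to move the parameters to the GPU.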