Assuming my model is already on a GPU, is there a way to get a state_dict of the model with CPU tensors without first moving the model to the CPU and then back to the GPU again?
Something like:
state_dict = model.state_dict()
state_dict = state_dict.cpu()
for k, v in state_dict.items():
    state_dict[k] = v.cpu()
There is also a one-liner to create a CPU copy of the state_dict:
{k: v.cpu() for k, v in model.state_dict()}
Small correction: to iterate over the key/value pairs, you need items() from the dict, not the dict itself:
{k: v.cpu() for k, v in model.state_dict().items()}
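For reuse, the comprehension above could be wrapped in a small helper. This is only a sketch: `cpu_state_dict` is a hypothetical name, not a PyTorch function, and it assumes every value in the dict exposes a `.cpu()` method, as the tensors returned by `model.state_dict()` do.

```python
from collections import OrderedDict


def cpu_state_dict(state_dict):
    """Return a new ordered dict with every value moved to the CPU.

    Hypothetical helper: works on any mapping whose values expose .cpu(),
    such as the tensors in model.state_dict().
    """
    # .items() yields (key, value) pairs; iterating the dict alone yields only keys.
    return OrderedDict((k, v.cpu()) for k, v in state_dict.items())


# Usage with PyTorch (assumes `model` is an nn.Module already on the GPU):
#   cpu_sd = cpu_state_dict(model.state_dict())
#   torch.save(cpu_sd, "checkpoint.pt")  # hypothetical file name
```

The model itself stays on the GPU, since only the entries of the new dict are replaced. Note that calling `.cpu()` on a tensor that is already on the CPU returns that same tensor rather than a copy, so if the model might be on the CPU and an independent copy is needed, something like `v.cpu().clone()` would probably be the safer choice.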