Error after moving model to cpu and moving it back to cuda during training

Hi,

I am training a model on the GPU. During training I move it to the CPU to run some operations on it, then move it back to CUDA. Right after moving it back, I can no longer call the optimizer's step function to take a gradient step.

I get the following error:

    self.optimiser.step()
  File "/nfs/home/aet4537/.conda/envs/pytorch/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 51, in wrapper
    return wrapped(*args, **kwargs)
  File "/nfs/home/aet4537/.conda/envs/pytorch/lib/python3.6/site-packages/torch/optim/sgd.py", line 100, in step
    buf.mul_(momentum).add_(1 - dampening, d_p)
RuntimeError: expected device cuda:0 but got device cpu

I saw a similar topic that suggested the following, but it does not work for me:

print(optimizer.state[list(optimizer.state.keys())[0]])

for p in optimizer.state.keys():
    param_state = optimizer.state[p]
    buf = param_state["momentum_buffer"]
    param_state["momentum_buffer"] = buf.cuda()  # move buf to device

print(optimizer.state[list(optimizer.state.keys())[0]])

Both of my prints show that the tensor I print is already on CUDA. How can I fix this?

Could you post a code snippet to reproduce this issue, please?
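
In the meantime, it might help to check more than the first state entry. The failing line combines the momentum buffer (buf) with a value derived from the gradient (d_p), so a gradient left on the CPU could possibly trigger the same message even when the buffers are on CUDA. Below is a rough diagnostic sketch, not a fix: device_report is a hypothetical helper name, and it only assumes the usual optimizer.param_groups and optimizer.state layout plus a .device attribute on tensors.

```python
from collections import Counter


def device_report(optimizer):
    """Count which devices hold the parameters, gradients, and state tensors.

    Hypothetical helper for debugging; it only reads attributes and
    does not move anything.
    """
    report = {"param": Counter(), "grad": Counter(), "state": Counter()}

    # Parameters and their gradients, across every parameter group.
    for group in optimizer.param_groups:
        for p in group["params"]:
            report["param"][str(p.device)] += 1
            if p.grad is not None:
                report["grad"][str(p.grad.device)] += 1

    # Every per-parameter state entry (e.g. "momentum_buffer"), not just the first.
    for param_state in optimizer.state.values():
        for value in param_state.values():
            # Skip non-tensor entries such as integer step counters.
            if hasattr(value, "device"):
                report["state"][str(value.device)] += 1

    return report
```

Calling print(device_report(optimizer)) right before optimizer.step() should show whether any parameter, gradient, or state tensor sits on a different device from the rest.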