From the PyTorch 1.4.0 docs in the torch.optim page:
Yes, .to('cuda') and .cuda() are equivalent calls for moving a model to the GPU. Make sure to move the model to the appropriate device before creating the optimizer.
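A minimal sketch of that ordering, in case it helps. The toy nn.Linear model, the SGD optimizer, the learning rate, and the random data are placeholders chosen for illustration, not anything from the question; the device falls back to the CPU so the snippet also runs without a GPU.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model, used only for illustration.
model = nn.Linear(4, 2)

# 1) Move the model first. On a GPU machine, model.cuda() does the same thing.
model = model.to(device)

# 2) Then build the optimizer from the parameters that now live on `device`.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One training step, with the inputs created on the same device as the model.
inputs = torch.randn(8, 4, device=device)
targets = torch.randn(8, 2, device=device)
loss = nn.functional.mse_loss(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Collect the device types of every parameter the optimizer tracks.
optimizer_devices = {
    p.device.type for group in optimizer.param_groups for p in group["params"]
}
```

After this runs, optimizer_devices should contain only the device type the model was moved to, which is a quick way to check that the optimizer and the model agree.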