It is fine in the case of SGD, which creates its momentum buffers lazily on the first step. However, if the optimizer constructs buffers in __init__ based on the parameter's type (and device), then you will have a problem, e.g. https://github.com/pytorch/pytorch/blob/master/torch/optim/adagrad.py#L30
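A minimal sketch of the problem, assuming the current Adagrad behavior of allocating its `sum` accumulator eagerly in `__init__`: if you cast the model *after* constructing the optimizer, the pre-built buffers keep the old dtype and no longer match the parameters.

```python
import torch

model = torch.nn.Linear(4, 2)

# SGD: buffers are created lazily at step time, so construction order is safe.
sgd = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Adagrad: the 'sum' accumulator is allocated in __init__, matching each
# parameter's dtype/device at that moment.
adagrad = torch.optim.Adagrad(model.parameters(), lr=0.1)

# Casting the model afterwards leaves the accumulators behind:
model.double()
for p in model.parameters():
    print(p.dtype, adagrad.state[p]["sum"].dtype)
    # parameters are now float64, but the accumulators are still float32,
    # so the next adagrad.step() will fail on the dtype mismatch
```

The safe ordering is to finish all `.to()` / `.cuda()` / `.double()` calls on the model first, and only then construct the optimizer.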