Should I create the optimizer after sending the model to the GPU?

abhidipbhattacharyya · October 4, 2021, 8:29am

Does it matter whether I create the optimizer before or after moving the model to the GPU? For example:

model = myModel()
optimizer = torch.optim.Adamax(model.parameters())
model = model.to(device)

vs.

model = myModel()
model = model.to(device)
optimizer = torch.optim.Adamax(model.parameters())

Thanks,
Abhidip

ptrblck · October 4, 2021, 10:05am

It shouldn’t matter, as the optimizer should hold references to the parameters (even after they are moved). However, the “safer” approach would be to move the model to the device first and create the optimizer afterwards.
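
The point about references in the reply above can be checked with a minimal sketch. It assumes default PyTorch settings, under which `Module.to()` updates each parameter in place rather than replacing it. `nn.Linear` is only a hypothetical stand-in for `myModel`, and the dtype change is added so the conversion is observable on a CPU-only machine as well.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the thread's myModel
model = nn.Linear(4, 2)

# Create the optimizer BEFORE moving/converting the model
optimizer = torch.optim.Adamax(model.parameters())
ids_before = [id(p) for p in model.parameters()]

# Use the GPU when one is present; the dtype change makes the
# conversion visible even when only a CPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device=device, dtype=torch.float64)

ids_after = [id(p) for p in model.parameters()]
optimizer_params = optimizer.param_groups[0]["params"]

# Are the Parameter objects the same Python objects as before?
same_objects = ids_before == ids_after

# Does the optimizer hold exactly the (now converted) parameters?
optimizer_sees_moved = all(
    p is q for p, q in zip(model.parameters(), optimizer_params)
)

print("same Parameter objects:", same_objects)
print("optimizer holds the moved parameters:", optimizer_sees_moved)
print("dtype/device seen by optimizer:",
      optimizer_params[0].dtype, optimizer_params[0].device)
```

If either check were to fail (for example on a backend where the parameters get replaced instead of updated in place), the optimizer would be updating stale tensors, which is why creating the optimizer after the move is the safer order.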

abhidipbhattacharyya · October 4, 2021, 4:51pm

Thank you. The reason I asked is that I am trying to train a model that is very demanding in terms of GPU memory. If I create the optimizer after moving the model to the GPU, it takes more GPU memory and my GPU sometimes runs out of memory. By creating the optimizer before moving the model to the GPU, I am able to save some GPU memory.

ptrblck · October 4, 2021, 7:45pm

Could you post an executable code snippet showing this behavior, please?
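
A snippet of the kind requested might look like the following sketch. It is not the original poster's code: `nn.Linear`, the layer size, and the helper name `run` are hypothetical stand-ins. It relies on the fact that `Adamax` allocates its state buffers (`exp_avg`, `exp_inf`) lazily on the first `step()`, with `zeros_like` on each parameter, so under that assumption both orderings would be expected to end up with the same allocations.

```python
import torch
import torch.nn as nn


def run(optimizer_first, device):
    """Build model and optimizer in the given order, take one step, report."""
    model = nn.Linear(256, 256)  # hypothetical stand-in for myModel
    if optimizer_first:
        optimizer = torch.optim.Adamax(model.parameters())
        model = model.to(device)
    else:
        model = model.to(device)
        optimizer = torch.optim.Adamax(model.parameters())

    # No per-parameter state exists until the first step
    states_at_creation = len(optimizer.state)

    # One forward/backward/step so the optimizer allocates its state
    model(torch.randn(8, 256, device=device)).sum().backward()
    optimizer.step()

    # Device on which the optimizer state buffers were created
    state_devices = {s["exp_avg"].device.type for s in optimizer.state.values()}

    # Bytes currently allocated by tensors on the GPU (None on CPU)
    allocated = (
        torch.cuda.memory_allocated(device) if device.type == "cuda" else None
    )
    return states_at_creation, state_devices, allocated


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
results = {first: run(first, device) for first in (True, False)}

for first, (n_state, devices, allocated) in results.items():
    print(f"optimizer_first={first}: state entries at creation={n_state}, "
          f"state device={devices}, allocated bytes={allocated}")
```

If the two orderings report different allocated bytes on a real model, the difference is likely coming from something other than the optimizer creation order, and that is what a reproducible snippet would help pin down.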