Save optimizer states on multi-GPU?

Hi, I have a question about saving the optimizer state in a multi-GPU training scenario:

checkpoint = {
    'model': model.module.state_dict(),
    'optimizer': optimizer.module.state_dict()}

The code above does not work. If I instead save the optimizer the same way I would on a single GPU:

checkpoint = {
    'model': model.module.state_dict(),
    'optimizer': optimizer.state_dict()}

Is that accurate?

The second method is correct. nn.DataParallel wraps only the model, so only the model has a .module attribute; the optimizer is a plain object with no such wrapper. Since the optimizer was created with the parameters of the underlying (non-parallelized) model, optimizer.state_dict() already refers to those parameters and can be saved directly.
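
For reference, here is a minimal sketch of the full save/load round trip; the nn.Linear model, learning rate, and checkpoint path are placeholders, and the .cuda() call assumes at least one GPU is available:

import torch
import torch.nn as nn

# nn.DataParallel wraps only the model; the optimizer stays a plain object.
model = nn.DataParallel(nn.Linear(10, 2).cuda())
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Save: unwrap the model with .module, save the optimizer directly.
checkpoint = {
    'model': model.module.state_dict(),
    'optimizer': optimizer.state_dict()}
torch.save(checkpoint, 'checkpoint.pth')

# Load: restore into the unwrapped model and into the optimizer itself.
state = torch.load('checkpoint.pth')
model.module.load_state_dict(state['model'])
optimizer.load_state_dict(state['optimizer'])

Saving model.module.state_dict() also keeps the checkpoint keys free of the 'module.' prefix, so the same file can later be loaded into a plain, non-parallelized model.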
