I am currently training a GAN on multiple GPUs with DataParallel, and I am trying to follow the official guidance here for saving torch.nn.DataParallel models, since I plan to run evaluation on a single GPU later. That means I need to load checkpoints trained on multiple GPUs onto a single GPU.
The official guidance indicates that, "to save a DataParallel model generically, save the model.module.state_dict(). This way, you have the flexibility to load the model any way you want to any device you want":

```python
# Save:
torch.save(model.module.state_dict(), PATH)

# Load:
# Load to whatever device you want
```
And these are my scripts for saving the generator and discriminator, respectively:

```python
torch.save(G.module.state_dict(),
           '%s/%s_module.pth' % (root, join_strings('_', ['G', name_suffix])))
torch.save(D.module.state_dict(),
           '%s/%s_module.pth' % (root, join_strings('_', ['D', name_suffix])))
```
However, when it comes to saving the checkpoint, I get this error:

```
Traceback (most recent call last):
  File "train.py", line 227, in <module>
    main()
  File "train.py", line 224, in main
    run(config)
  File "train.py", line 206, in run
    state_dict, config, experiment_name)
  File "/home/BIGGAN/train_fns.py", line 101, in save_and_sample
    experiment_name, None, G_ema if config['ema'] else None)
  File "/home/BIGGAN/utils.py", line 721, in save_weights
    torch.save(G.module.state_dict(),
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 594, in __getattr__
    type(self).__name__, name))
AttributeError: 'Generator' object has no attribute 'module'
```
But the checkpoints can be saved if I use:

```python
torch.save(G.state_dict(),
           '%s/%s.pth' % (root, join_strings('_', ['G', name_suffix])))
torch.save(D.state_dict(),
           '%s/%s.pth' % (root, join_strings('_', ['D', name_suffix])))
```
I am using PyTorch version `1.5.0a0+8f84ded`. I am not sure whether the error has something to do with my PyTorch version, or whether I have missed something in my scripts.
Just in case: if there is another way to load checkpoints trained on multiple GPUs onto a single GPU, that would also be great.
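For example, I know that if a checkpoint is saved from the DataParallel wrapper itself (i.e. `model.state_dict()` rather than `model.module.state_dict()`), every key carries a `module.` prefix that a plain model will not accept. Would stripping that prefix at load time be a reasonable workaround? A minimal sketch of what I mean (`strip_module_prefix` is a hypothetical helper; the dummy dict stands in for a real state_dict of tensors):

```python
from collections import OrderedDict

def strip_module_prefix(state_dict):
    """Remove the 'module.' prefix that nn.DataParallel adds to
    parameter names, so the checkpoint loads into a plain model."""
    return OrderedDict(
        (k[len('module.'):] if k.startswith('module.') else k, v)
        for k, v in state_dict.items()
    )

ckpt = {'module.conv.weight': 0, 'module.conv.bias': 1}
print(list(strip_module_prefix(ckpt).keys()))  # keys without the prefix
# Loading would then be:
# model.load_state_dict(strip_module_prefix(torch.load(PATH, map_location='cuda:0')))
```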
Any guidance and assistance would be greatly appreciated!