It seems better to save/load the state dict of the wrapped “module” instance inside nn.DataParallel, rather than the state dict of the nn.DataParallel wrapper itself. But I’m not sure whether that’s a valid option. Is this the recommended way to do it?
import torch
from torchvision.models import resnet101

# Save the inner module's state dict (keys without the "module." prefix)
model = resnet101()
model = torch.nn.DataParallel(model)
torch.save(model.module.state_dict(), 'state')

# Load it back into another wrapped model via .module
model2 = resnet101()
model2 = torch.nn.DataParallel(model2)
model2.module.load_state_dict(torch.load('state'))
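For context, the practical difference is key naming: the wrapper registers the inner model as a submodule named "module", so the wrapper's own state dict prefixes every key with "module.", while model.module.state_dict() keeps the plain keys that a bare (unwrapped) model expects. A minimal sketch, using nn.Linear as a stand-in for resnet101:

```python
import torch
import torch.nn as nn

# Small stand-in model (hypothetical; the original uses resnet101)
dp = nn.DataParallel(nn.Linear(4, 2))

# Wrapper keys are prefixed; inner keys are plain
print(list(dp.state_dict().keys()))         # ['module.weight', 'module.bias']
print(list(dp.module.state_dict().keys()))  # ['weight', 'bias']

# The plain-key checkpoint loads directly into an unwrapped model
plain = nn.Linear(4, 2)
plain.load_state_dict(dp.module.state_dict())
```

So saving model.module.state_dict() produces a checkpoint that loads into either a wrapped model (via .module) or a bare one, which is why it is the more portable choice.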
I also just found that nn.DataParallel works fine even when no GPU is available. Would it be better to use nn.DataParallel all the time, with or without a GPU?
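On the CPU question: when no CUDA device is visible, the wrapper's device list is empty and its forward pass simply calls the wrapped module directly, so wrapping is harmless on a CPU-only machine. A small sketch that runs either way (the explicit .to(device) is only needed so the same snippet also works on a GPU box, since DataParallel expects the parameters on its first device):

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Wrapping works on CPU-only machines; with no GPUs the forward
# pass falls through to the inner module unchanged.
model = nn.DataParallel(nn.Linear(4, 2)).to(device)
out = model(torch.randn(8, 4).to(device))
print(out.shape)  # torch.Size([8, 2])
```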