About nn.parallel.DistributedDataParallel

I trained my model with 4 GPUs, like:

mymodule = nn.parallel.DistributedDataParallel(mymodule, device_ids=[local_rank])
and I saved the model with:

torch.save(mymodule.state_dict() , '%s/modelG_%d.pth' % (opt.outf, epoch))
When I load it, I get:
RuntimeError: storage has wrong size: expected -4763383137013773690 got 128

What is wrong? And is there any way to fix it without re-training?

Have you ensured that only one process is writing to the checkpoint? Multiple processes writing to the same file will corrupt it.
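A common pattern is to guard the `torch.save` call so that only rank 0 writes the file. Here is a minimal sketch of that idea; the model, the checkpoint path, and the `get_rank` helper are placeholders for this example, and the rank check falls back to 0 when not running under `torch.distributed` so the snippet also works single-process:

```python
import torch
import torch.distributed as dist
import torch.nn as nn

def get_rank():
    # Returns 0 when torch.distributed is not initialized (single-process fallback).
    return dist.get_rank() if dist.is_available() and dist.is_initialized() else 0

model = nn.Linear(4, 2)  # stand-in for your DDP-wrapped module
# Note: if the model is wrapped in DistributedDataParallel, save
# model.module.state_dict() so the keys carry no "module." prefix.
ckpt_path = "modelG_demo.pth"  # hypothetical path for this sketch

if get_rank() == 0:
    # Only one process writes the checkpoint; all other ranks skip the save.
    torch.save(model.state_dict(), ckpt_path)

if dist.is_available() and dist.is_initialized():
    dist.barrier()  # ensure the file exists before any rank tries to read it

# Load on CPU first, then move to the right device per rank.
state = torch.load(ckpt_path, map_location="cpu")
model.load_state_dict(state)
```

The `barrier()` keeps the other ranks from racing ahead and reading a half-written file when they restore from the same checkpoint.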


I did not consider this problem :cold_sweat:.
Looks like I will have to modify my code and re-train my model.
Thanks a lot.