Why would you have to train the model again if you save baseline’s state_dict? You would be able to create the baseline model afterwards and load its state_dict, wouldn’t you?
Thanks for your reply again! I might not have explained it clearly.
I save the baseline through this function in train.py:
torch.save(
    {
        'model': get_inner_model(model).state_dict(),
        'optimizer': optimizer.state_dict(),
        'rng_state': torch.get_rng_state(),
        'cuda_rng_state': torch.cuda.get_rng_state_all(),
        'baseline': baseline.state_dict()
    },
    os.path.join(opts.save_dir, 'epoch-{}.pt'.format(epoch))
)
As you can see, I save the baseline through baseline.state_dict(), but the state_dict() here is defined by the baseline class itself, which is:
def state_dict(self):
    return {
        'model': self.model,
        'dataset': self.dataset,
        'epoch': self.epoch
    }
Therefore the .pt file I have now includes a whole model (the baseline part) and the parameters (the model part). When I try to evaluate the model, I load the model part through:
def torch_load_cpu(load_path):
    return torch.load(load_path, map_location=lambda storage, loc: storage)

load_data = torch_load_cpu(model_filename)
model.load_state_dict({**model.state_dict(), **load_data.get('model', {})})
But when it gets to torch_load_cpu(), because the checkpoint contains the baseline’s whole model, I get the error: “RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.” So I was wondering if there is a way to load only the model part.
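In case it helps, here is a minimal sketch of one workaround: re-save a slim checkpoint that keeps only the 'model' weights, so evaluation never has to unpickle the stored baseline module. The function and file names are just illustrative, and it assumes torch.load succeeds at least once on a machine where the checkpoint loads (if it fails because the pickled baseline holds a DistributedDataParallel module, initializing a dummy single-process group via torch.distributed.init_process_group first is a commonly suggested workaround):

```python
import os
import tempfile

import torch
import torch.nn as nn


def slim_checkpoint(load_path, save_path):
    # Load the full checkpoint on CPU, then re-save only the parameter
    # tensors under the 'model' key; the baseline module is dropped.
    ckpt = torch.load(load_path, map_location=lambda storage, loc: storage)
    torch.save({'model': ckpt['model']}, save_path)


# Tiny demo with a hypothetical model: write a checkpoint that also
# carries extra state, then strip it down to the weights.
model = nn.Linear(4, 2)
full_path = os.path.join(tempfile.gettempdir(), 'epoch-demo.pt')
slim_path = os.path.join(tempfile.gettempdir(), 'epoch-demo-slim.pt')
torch.save({'model': model.state_dict(),
            'rng_state': torch.get_rng_state()}, full_path)
slim_checkpoint(full_path, slim_path)

# The slim file loads anywhere, since it contains only tensors.
fresh = nn.Linear(4, 2)
fresh.load_state_dict({**fresh.state_dict(),
                       **torch.load(slim_path)['model']})
```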
Why would you have to train the model again if you save baseline’s state_dict? You would be able to create the baseline model afterwards and load its state_dict, wouldn’t you?
Right now the .pt file I have only contains the baseline’s whole model.
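For what it’s worth, a minimal sketch of the fix the quoted suggestion points at: have the baseline’s state_dict store parameter tensors instead of the nn.Module itself, so torch.load never needs to unpickle a (Distributed)DataParallel model. The class and attribute names below are illustrative assumptions, not the repo’s actual code, and the 'dataset' entry is omitted for brevity:

```python
import torch
import torch.nn as nn


class RolloutBaseline:
    # Hypothetical baseline wrapper for illustration only.
    def __init__(self, model, epoch=0):
        self.model = model
        self.epoch = epoch

    def state_dict(self):
        # Store only the parameter tensors, not the module object,
        # so the resulting checkpoint can be unpickled anywhere.
        return {
            'model': self.model.state_dict(),
            'epoch': self.epoch,
        }

    def load_state_dict(self, state):
        self.model.load_state_dict(state['model'])
        self.epoch = state['epoch']


# Round-trip: save the baseline's state, rebuild it, and reload.
baseline = RolloutBaseline(nn.Linear(4, 2), epoch=3)
saved = baseline.state_dict()
restored = RolloutBaseline(nn.Linear(4, 2))
restored.load_state_dict(saved)
```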