Model object has no attribute 'load_state_dict'

werdas34 · September 28, 2022, 11:41am

Hello,

I want to load a model from a file and determine the model parameters.

  import os
  import torch

  model_path = os.path.normpath(args.model)
  model = model_path.split(os.sep)[-1]

  splitted = model.split("_")
  splitted = splitted[0:-3]
  section = "_".join(splitted)

  model = CorefModel(args.config_file, section) # <- this works
  model.load_state_dict(torch.load(args.model)) # <- this does not

  total_params = sum(param.numel() for param in model.parameters())
  # str() is needed: concatenating a str and an int raises a TypeError
  print("The model " + section + " has a parameter count of " + str(total_params))

I get the error AttributeError: 'CorefModel' object has no attribute 'load_state_dict'

The save method:

  def save_weights(self):
      """ Saves trainable models as state dicts. """
      to_save: List[Tuple[str, Any]] = \
          [(key, value) for key, value in self.trainable.items()
           if self.config.bert_finetune or key != "bert"]
      to_save.extend(self.optimizers.items())
      to_save.extend(self.schedulers.items())

      if self.epochs_trained == self.config.train_epochs:
          time = datetime.strftime(datetime.now(), "%Y.%m.%d_%H.%M")
          path = os.path.join(self.config.data_dir,
                              f"{self.config.section}"
                              f"_(e{self.epochs_trained}_{time}).pt")
          savedict = {name: module.state_dict() for name, module in to_save}
          savedict["epochs_trained"] = self.epochs_trained  # type: ignore
          torch.save(savedict, path)

Do I need to implement load_state_dict myself? Or does this only work on a GPU? Currently I am testing it on a CPU.

ptrblck · September 28, 2022, 3:24pm

Could you explain how CorefModel is defined and post its class definition?

werdas34 · September 28, 2022, 6:57pm

build_model() and at the top is the CorefModel class.

ptrblck · September 28, 2022, 11:49pm

Thanks for the link. Based on the code it’s not derived from nn.Module but a custom class, which doesn’t provide the load_state_dict method. You might want to use the load_weights function instead, which is also a custom method of this class.
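
To illustrate the point, here is a minimal sketch. PlainWrapper is a hypothetical stand-in for a class like CorefModel (it is not the actual class): it keeps its submodules in an ordinary dict and does not inherit from nn.Module, so only the submodules have load_state_dict.

```python
import torch
from torch import nn


class PlainWrapper:
    """Hypothetical stand-in: holds modules but is not an nn.Module itself."""

    def __init__(self):
        # Submodules live in a plain dict, similar to self.trainable in the thread.
        # The names and layer sizes are made up for illustration.
        self.trainable = {
            "encoder": nn.Linear(4, 3),
            "scorer": nn.Linear(3, 1),
        }


wrapper = PlainWrapper()

# The wrapper inherits nothing from nn.Module, so the method is absent.
wrapper_has_method = hasattr(wrapper, "load_state_dict")

# Every submodule is an nn.Module and therefore provides the method.
modules_have_method = all(
    hasattr(module, "load_state_dict") for module in wrapper.trainable.values()
)

print(wrapper_has_method)   # False
print(modules_have_method)  # True
```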

werdas34 · September 29, 2022, 10:26am

I added a return state_dicts at the end of the load_weights method to access the dict. But I can’t call load_state_dict on the dict, and returning self.load_state_dict(state_dicts) from load_weights doesn’t work either.

def load_weights():
     # added at the end (I tried each of these in turn)
     return state_dicts # works, but ...
     return self.load_state_dict(state_dicts) # does not work

# in my file
model = CorefModel(args.config_file, section)
model = model.load_weights(args.model) # cannot use load_state_dict

# and the returned dict has no parameters attribute and no load_state_dict()

Is there a workaround?

ptrblck · September 29, 2022, 3:41pm

That’s expected, since your custom CorefModel class does not define a load_state_dict method. You would need to call load_state_dict on the actual model, which is derived from nn.Module, not on your custom class.
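
As a rough sketch of what that could look like: the checkpoint written by save_weights is a dict that maps each name to a state dict, plus an epochs_trained entry, so each state dict can be loaded into its matching nn.Module. The module names, layer sizes, and file name below are assumptions made for illustration, not taken from the repository.

```python
import os
import tempfile

import torch
from torch import nn


def make_modules():
    # Hypothetical submodules standing in for the entries of self.trainable.
    return {"encoder": nn.Linear(4, 3), "scorer": nn.Linear(3, 1)}


# Save in the same shape as save_weights: {name: state_dict, "epochs_trained": int}.
source = make_modules()
savedict = {name: module.state_dict() for name, module in source.items()}
savedict["epochs_trained"] = 20

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(savedict, path)

# Load on the CPU, then call load_state_dict on each nn.Module, not on the wrapper.
target = make_modules()
checkpoint = torch.load(path, map_location="cpu")
for name, module in target.items():
    module.load_state_dict(checkpoint[name])

epochs_trained = checkpoint["epochs_trained"]

# The freshly created modules now carry the saved weights.
weights_match = all(
    torch.equal(source[name].weight, target[name].weight) for name in source
)
```

Passing map_location="cpu" to torch.load also lets a checkpoint that was saved on a GPU be loaded on a CPU-only machine.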

werdas34 · September 29, 2022, 4:24pm

OK, I see. So the easy way does not work.
How do I get the model parameters, then? Do I have to iterate over the dict myself and add up the values?
Or do I have to return self.optimizers, self.schedulers and self.trainable and then call parameters() on each of them?
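
For reference, both options can be sketched as follows, assuming self.trainable is a dict of nn.Module objects as in save_weights. The module names and sizes are made up. Optimizers and schedulers hold no model weights, so they do not need to be counted.

```python
import torch
from torch import nn

# Hypothetical stand-in for self.trainable.
trainable = {"encoder": nn.Linear(4, 3), "scorer": nn.Linear(3, 1)}

# Option 1: sum over the parameters of every nn.Module.
total_from_modules = sum(
    param.numel() for module in trainable.values() for param in module.parameters()
)

# Option 2: sum over the tensors in the saved state dicts.
# Note that a state dict also contains buffers, if a module has any.
savedict = {name: module.state_dict() for name, module in trainable.items()}
savedict["epochs_trained"] = 20  # non-dict entry, skipped below

total_from_checkpoint = sum(
    tensor.numel()
    for entry in savedict.values()
    if isinstance(entry, dict)
    for tensor in entry.values()
    if torch.is_tensor(tensor)
)

print(total_from_modules, total_from_checkpoint)  # 19 19
```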