To let a non-DDP model load a state dict from a DDP model, consume_prefix_in_state_dict_if_present() needs to be applied to strip the prefix “module.” in the DDP state dict before loading.
import collections

new_state_dict = collections.OrderedDict()
for k, v in checkpoint['model_dict'].items():
    name = k.replace("module.", '')  # remove `module.`
    new_state_dict[name] = v
but I would prefer to use consume_prefix_in_state_dict_if_present.
Can someone elucidate the correct usage of this please? Obviously, I am not getting it!
Hi @John_J_Watson, sorry for the confusion. consume_prefix_in_state_dict_if_present removes the prefix in place and does not return a value. Just pass checkpoint['model_dict'] to it and keep using that same dict afterwards, instead of assigning the result to a temporary variable.
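To make the in-place usage concrete, here is a minimal sketch. The nn.Linear model and the simulated checkpoint are stand-ins I made up for illustration; only the 'model_dict' key and the "module." prefix come from the thread.

```python
import collections

import torch.nn as nn
from torch.nn.modules.utils import consume_prefix_in_state_dict_if_present

# Hypothetical stand-in for the real (non-DDP) model.
model = nn.Linear(4, 2)

# Simulate a checkpoint saved from a DDP-wrapped model:
# every key carries the "module." prefix.
checkpoint = {
    "model_dict": collections.OrderedDict(
        ("module." + k, v) for k, v in model.state_dict().items()
    )
}

# The call mutates the dict in place and returns None,
# so its result should not be assigned and used as the state dict.
result = consume_prefix_in_state_dict_if_present(checkpoint["model_dict"], "module.")

# The same dict now has un-prefixed keys and loads into the plain model.
model.load_state_dict(checkpoint["model_dict"])
```

Assigning the return value (for example `new_state_dict = consume_prefix_in_state_dict_if_present(...)`) would leave `new_state_dict` as None, which is probably the source of the confusion above.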
Thank you @wayi for this. I don't know why it didn't occur to me that it could be in place!
Just as a follow-up question, does this wrapper basically do the same as the following?
new_state_dict = collections.OrderedDict()
for k, v in checkpoint['model_dict'].items():
    name = k.replace("module.", '')  # remove `module.`
    new_state_dict[name] = v
Is that right? Are there any advantages/disadvantages of using either approach?
One subtle difference is that the “_metadata” field (if any) is handled separately. See:
Other than that, I don’t think there is a big difference. I would say your own implementation is a little less memory efficient, as it does not work in place.
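One more difference worth checking shows up when "module." also appears in the middle of a key: str.replace removes every occurrence, while a prefix-only strip touches just the start of the key. The sketch below is plain Python. Both helper names are hypothetical, the in-place helper is a simplified illustration rather than the library's source, and the integer values stand in for tensors.

```python
import collections


def strip_prefix_copy(state_dict, prefix="module."):
    """Loop from the thread: builds a second dict and uses str.replace."""
    new_state_dict = collections.OrderedDict()
    for k, v in state_dict.items():
        new_state_dict[k.replace(prefix, "")] = v
    return new_state_dict


def strip_prefix_in_place(state_dict, prefix="module."):
    """Hypothetical in-place variant: removes only a leading prefix."""
    for k in list(state_dict.keys()):
        if k.startswith(prefix):
            state_dict[k[len(prefix):]] = state_dict.pop(k)


# Placeholder values; the second "module." in the first key is part of
# a submodule name, not the DDP prefix.
sd = collections.OrderedDict(
    [("module.encoder.module.weight", 1), ("module.bias", 2)]
)

copied = strip_prefix_copy(sd)   # sd itself is left untouched here
strip_prefix_in_place(sd)        # sd is rewritten, no second dict is built

print(list(copied.keys()))  # ['encoder.weight', 'bias']
print(list(sd.keys()))      # ['encoder.module.weight', 'bias']
```

If a model really has a submodule named "module", the replace-based loop would produce keys that no longer match the model, so stripping only the leading prefix is the safer choice in that case.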