Loading Pretrained Weights

Hi,
I pretrained a custom model with 402 targets. Now I need to transfer those weights to a model with 206 targets. How can I do that? This is what I am doing right now:

model.load_state_dict(torch.load(f"FOLD{fold}_.pth"), strict=False)

But it is not working; it shows a size mismatch error.

@ptrblck do you have any idea?

If you mean a subset of targets, something like this:

d = torch.load(f"FOLD{fold}_.pth")
for suffix in (".weight", ".bias"):
    key = PREFIX + suffix
    d[key] = map_402_to_206(d[key])  # shrink the output parameters from 402 to 206 targets
model.load_state_dict(d)

PREFIX is the output linear layer’s key prefix. map_402_to_206 should reflect how you select the subset; the simplest case is a subrange: d[key] = d[key][:206]
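For that simplest case, map_402_to_206 could be as small as the following sketch (assuming the first 206 of the 402 targets are the ones you want to keep):

def map_402_to_206(t):
    # Works for both the weight ([402, hidden]) and the bias ([402]),
    # since the target dimension comes first in both tensors.
    # For an arbitrary subset, index with a list/tensor of target indices
    # instead, e.g. t[selected_indices].
    return t[:206]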

If you want to reuse the hidden layers on a different set of targets, delete the above keys instead and use strict=False, as sketched below.
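A minimal sketch, assuming the output layer was assigned as self.fc in __init__ ("fc" stands in for your layer’s actual name):

import torch

d = torch.load(f"FOLD{fold}_.pth")
for suffix in (".weight", ".bias"):
    d.pop("fc" + suffix, None)  # drop the 402-target output parameters
# The hidden layers load from the checkpoint; the new 206-target output
# layer keeps its fresh initialization.
model.load_state_dict(d, strict=False)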


Thanks for your response. I actually want to use the weights for a different set of targets, and I am using strict=False, but it is still throwing a size mismatch error.

def copy_hidden_layers(model_source, model_dest):
    # Copy the weights of every layer except the last (output) layer
    for i in range(len(model_source.layers[:-1])):
        model_dest.layers[i].set_weights(model_source.layers[i].get_weights())
    return model_dest

This is the Keras equivalent of what I want to do.
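In PyTorch, a rough equivalent of that loop might be the following sketch, assuming both models expose their layers as an nn.ModuleList attribute named layers (a stand-in for whatever the real models use):

import torch

with torch.no_grad():
    # Copy parameters layer by layer, skipping the final output layer.
    for src_layer, dst_layer in zip(model_source.layers[:-1], model_dest.layers[:-1]):
        for p_src, p_dst in zip(src_layer.parameters(), dst_layer.parameters()):
            p_dst.copy_(p_src)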

Did you follow @googlebot’s suggestion and delete the key(s) before using strict=False?
The strict=False argument ignores unexpected or missing keys, not shape mismatch errors.
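A small self-contained demonstration of that behavior (using plain nn.Linear modules as stand-ins for the real models):

import torch
import torch.nn as nn

src = nn.Linear(10, 402)
dst = nn.Linear(10, 206)

# strict=False does not help here: "weight" and "bias" exist in both
# state_dicts, but their shapes differ, so a size mismatch error is raised.
try:
    dst.load_state_dict(src.state_dict(), strict=False)
except RuntimeError as e:
    print(e)

# After deleting the mismatched keys, the load succeeds; the missing
# keys are simply ignored because of strict=False.
sd = src.state_dict()
del sd["weight"], sd["bias"]
dst.load_state_dict(sd, strict=False)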

How can I delete the keys?

You can delete key-value pairs from an OrderedDict using del as seen here:

from collections import OrderedDict

d = OrderedDict()
d['a'] = 1
d['b'] = 2
print(d)     # contains both 'a' and 'b'
del d['a']   # remove the key-value pair for 'a'
print(d)     # only 'b' remains

However, @googlebot already posted a solution where you would copy some of the pretrained parameters into the state_dict.

Thanks, but what would PREFIX be?

You can just explore the Python dictionary, e.g. list(d.keys()); it will contain the [sub]module names as assigned in __init__. Deleting is then something like: del d["output_layer.weight"]
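Putting it together ("output_layer" is just an example name; use whatever list(d.keys()) actually shows for your model):

import torch

d = torch.load(f"FOLD{fold}_.pth")
print(list(d.keys()))  # locate the output layer's prefix among the [sub]module names
del d["output_layer.weight"], d["output_layer.bias"]
model.load_state_dict(d, strict=False)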