How to specify only a single module in the Per-parameter options?

I have a deeply nested model with many PReLU modules. It is not practical to spell out every parameter in the optimizer as the default group and then respecify all the PReLU parameters as a special case.

I want to apply zero weight decay to all PReLU parameters and keep everything else at the defaults.

It says in the documentation:

You can still pass options as keyword arguments. They will be used as defaults, in the groups that didn’t override them. This is useful when you only want to vary a single option while keeping all others consistent between parameter groups.

This means I only need to pass the options that I want to make specific and keep the rest following the defaults. How can this be applied?

Just like for the lr in the snippet below the note you cite, you pass weight_decay globally set to what you need and add a 'weight_decay': 0.0 entry to the dictionary for the param group with the PReLU parameters.
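
For reference, that documentation snippet looks roughly like this (quoted from memory as a sketch; it assumes a model with base and classifier submodules):

torch.optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)

Here the classifier group overrides lr with 1e-3, the base group inherits the default lr=1e-2, and momentum=0.9 applies to both groups. Applied to your weight-decay case: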

import torch

model = torch.nn.Sequential(torch.nn.Linear(3, 4), torch.nn.PReLU())

# qualified names of all PReLU submodules (here just '1')
prelus = {n for n, m in model.named_modules() if isinstance(m, torch.nn.PReLU)}

# a parameter belongs to a PReLU if its name, minus the final component
# (e.g. '.weight'), is one of the PReLU module names
prelu_param_names = {n for n, _ in model.named_parameters() if n.rsplit('.', 1)[0] in prelus}

torch.optim.SGD([
    {'params': [p for n, p in model.named_parameters() if n not in prelu_param_names]},
    {'params': [p for n, p in model.named_parameters() if n in prelu_param_names], 'weight_decay': 0.0}
], lr=1e-2, momentum=0.9, weight_decay=1e-5)
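
If you want to double-check the grouping, you can bind the optimizer to a name and inspect its param_groups; after construction each group carries the full set of options, with unspecified ones filled in from the keyword defaults:

opt = torch.optim.SGD([
    {'params': [p for n, p in model.named_parameters() if n not in prelu_param_names]},
    {'params': [p for n, p in model.named_parameters() if n in prelu_param_names], 'weight_decay': 0.0}
], lr=1e-2, momentum=0.9, weight_decay=1e-5)

for i, group in enumerate(opt.param_groups):
    print(i, group['weight_decay'], len(group['params']))
# prints: 0 1e-05 2   (Linear weight and bias, default weight decay)
#         1 0.0 1     (PReLU weight, no weight decay)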

I’m sure you could make the filtering more terse, for instance by keying on parameter identity as in the sketch below, but names are easy to inspect and debug if things go wrong.
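
A terser variant (a sketch; equivalent for this model, but if something is mis-grouped you only have anonymous tensors to look at):

prelu_param_ids = {id(p) for m in model.modules()
                   if isinstance(m, torch.nn.PReLU)
                   for p in m.parameters()}

torch.optim.SGD([
    {'params': [p for p in model.parameters() if id(p) not in prelu_param_ids]},
    {'params': [p for p in model.parameters() if id(p) in prelu_param_ids], 'weight_decay': 0.0}
], lr=1e-2, momentum=0.9, weight_decay=1e-5)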

Best regards

Thomas
