I am trying to write a pass that automatically applies a per-module parameter change across a complete model. For example, given a model as input, I want to wrap all the linear and conv layers with weight_norm if they don't have it already. Reading through the docs, I could think of two ways to do it:
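# Function to apply weight_norm to a module if it has not been applied already
import torch.nn as nn
from torch.nn.utils import weight_norm

def applyWeightNorm(module):
    # torch.nn.utils.weight_norm registers a weight_g parameter, so its
    # presence means the module is already wrapped
    if isinstance(module, (nn.Linear, nn.Conv2d)) and not hasattr(module, "weight_g"):
        weight_norm(module)
# end applyWeightNorm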
Two ways to apply it to every module in the model:
model.apply(applyWeightNorm)
or
for name, module in model.named_modules():
    applyWeightNorm(module)
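To make the comparison concrete, here is a minimal self-contained sketch of both approaches on a toy model. It assumes weight_norm is torch.nn.utils.weight_norm (which registers weight_g and weight_v parameters on the wrapped module); build_toy_model and wrapped_names are hypothetical helpers I made up just for illustration.

```python
import torch.nn as nn
from torch.nn.utils import weight_norm


def applyWeightNorm(module):
    # Wrap only Linear/Conv2d layers that do not yet carry the weight_g
    # parameter registered by torch.nn.utils.weight_norm.
    if isinstance(module, (nn.Linear, nn.Conv2d)) and not hasattr(module, "weight_g"):
        weight_norm(module)


def build_toy_model():
    # Hypothetical model, used only for illustration.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(8, 4),
    )


def wrapped_names(model):
    # Names of the submodules that currently have weight norm applied.
    return [name for name, m in model.named_modules() if hasattr(m, "weight_g")]


# Way 1: Module.apply calls the function on every submodule and on the model itself.
model_a = build_toy_model()
model_a.apply(applyWeightNorm)

# Way 2: explicit loop over named_modules(), which also yields the root module.
model_b = build_toy_model()
for _, module in model_b.named_modules():
    applyWeightNorm(module)

# Running the pass a second time should change nothing because of the weight_g check.
model_a.apply(applyWeightNorm)

print(wrapped_names(model_a))
print(wrapped_names(model_b))
```

As far as I can tell, both variants visit the same set of modules in this sketch, so the difference seems to be mostly stylistic.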
Does this approach make sense? Is one of these ways better than the other?
I am running in multi-GPU distributed mode, if that matters.