Best practices for "turning off" parts of the model?

Hi all,

I’m running some ablation studies (or sensitivity studies, in other jargon), turning different components of my model on and off.

For now I’ve assumed that, to avoid passing the input through a certain layer (say an nn.Embedding), it would be sufficient to block the gradient contributions from that component of the model.

That will evidently leave my model with more parameters than necessary, with some of the layers never being updated past their initialisation.
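Concretely, my current approach looks roughly like this (a minimal sketch; the layer names and sizes are arbitrary placeholders):

import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(1000, 64)  # the component I want to turn off
        self.linear = nn.Linear(64, 10)

    def forward(self, x):
        return self.linear(self.embedding(x))

model = Model()
# Freeze the embedding so it receives no gradient updates; it still runs
# in the forward pass, stuck at its initial (random) weights.
model.embedding.requires_grad_(False)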

Something tells me this is not exactly the best way to do these things. What would be your take? Would you rather define a specific model just for that component you want to turn on/off?

You could do something like

class Model(nn.Module):
    def __init__(self, args):
        super().__init__()
        self.args = args
        self.m1 = Module1()
        if args.use_my_second_layer:
            self.m2 = Module2()

    def forward(self, x):
        h = self.m1(x)
        if self.args.use_my_second_layer:
            h = self.m2(h)  # apply m2 to m1's output, not the raw input
        return h

Yeah, I’ve tried something similar, but I personally think it can make the model definition incredibly bloated.

I think it’s “easier” to comprehend something like

class Model(nn.Module):
    def __init__(self, args) -> None:
        super().__init__()
        self.args = args
        self.m1 = Module1()
        self.m2 = Module2()

    def forward(self, x):
        match self.args.logic:
            case 'm1':
                x = self.m1(x)
            case 'm2':
                x = self.m2(x)
            case 'm1m2':
                x = self.m1(x)
                x = self.m2(x)
            case _:
                # fail loudly instead of silently returning the input unchanged
                raise ValueError(f'unknown logic: {self.args.logic!r}')
        return x

But I’m also thinking the above is not best practice either lol
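One more pattern I’ve been toying with (just a sketch, and the args.use_m1 / args.use_m2 flags are hypothetical): construct every slot unconditionally, but swap any disabled component for nn.Identity, so forward stays branch-free:

import torch.nn as nn

class Model(nn.Module):
    def __init__(self, args) -> None:
        super().__init__()
        # Disabled components become no-ops instead of dead, frozen layers.
        self.m1 = Module1() if args.use_m1 else nn.Identity()
        self.m2 = Module2() if args.use_m2 else nn.Identity()

    def forward(self, x):
        return self.m2(self.m1(x))

The obvious caveat: nn.Identity only works where the disabled module’s input and output shapes agree, so it wouldn’t directly stand in for an nn.Embedding.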