How do you create a layer that behaves differently during training and evaluation like dropout?

I’m trying to implement a dropout-esque layer that should only be applied at training time, in order to do some batch-wise swap noise augmentation. Unfortunately, the layers in PyTorch that do this (like nn.Dropout) implement the behavior at a much lower level, so I can’t find an example of how to create a layer that is only applied during training.

Any help or pointers to example layers where this is done would be appreciated.

Do you mean like

def forward(self, x):
    if self.training:  # self.training is inherited from nn.Module
        x = training_only_layer(x)
    return x
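For the swap-noise use case you mentioned, a minimal sketch could look like the following. (SwapNoise and its p parameter are made-up names for illustration, not a built-in PyTorch layer; this assumes a 2-D float tensor of tabular features.)

```python
import torch
from torch import nn

class SwapNoise(nn.Module):
    """With probability p, replace each entry with the same feature
    taken from a random other row in the batch. Only active in
    training mode."""

    def __init__(self, p: float = 0.15):
        super().__init__()
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # self.training comes from nn.Module and is toggled by
        # model.train() / model.eval()
        if not self.training:
            return x
        # mask[i, j] == True -> entry (i, j) gets swapped
        mask = torch.rand_like(x) < self.p
        # for each entry, pick a random row to take the value from
        rows = torch.randint(x.size(0), x.shape, device=x.device)
        swapped = torch.gather(x, 0, rows)
        return torch.where(mask, swapped, x)
```

Calling model.eval() then makes the layer a no-op, exactly like nn.Dropout.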

That’s what I was looking for, yeah. Thanks! I should have taken a closer look at what was being inherited from nn.Module.