I’m trying to implement a dropout-like layer that should only be applied at training time, in order to do batch-wise swap noise augmentation. Unfortunately, the built-in PyTorch layers that behave this way implement it at a much lower level, so I can’t find an example of how to write a layer that is only applied during training.
Any help or pointers to example layers where this is done would be appreciated.
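For what it’s worth, a common pattern is to check `self.training` inside `forward` — this flag is toggled by `model.train()` / `model.eval()`, which is also how `nn.Dropout` decides whether to apply noise. Below is a rough sketch of a batch-wise swap noise layer built on that idea; the class name `BatchSwapNoise` and the default probability are my own choices, not anything from PyTorch itself:

```python
import torch
import torch.nn as nn

class BatchSwapNoise(nn.Module):
    """Swap-noise augmentation for a 2D (batch, features) tensor:
    with probability p, each element is replaced by the value in the
    same column from a random row of the batch. Only applied while
    the module is in training mode."""

    def __init__(self, p=0.15):
        super().__init__()
        self.p = p

    def forward(self, x):
        # self.training is set by model.train() / model.eval();
        # in eval mode the layer is an identity, like nn.Dropout.
        if not self.training:
            return x
        mask = torch.rand_like(x) < self.p  # which elements to swap
        # For every position, pick a random row index; keep the column.
        rows = torch.randint(0, x.size(0), x.shape, device=x.device)
        cols = torch.arange(x.size(1), device=x.device).expand_as(rows)
        shuffled = x[rows, cols]  # column-preserving random draw
        return torch.where(mask, shuffled, x)
```

Usage would be the usual thing: put the layer in your model, and it is active after `model.train()` and a no-op after `model.eval()`, with no extra plumbing needed.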