Question about how PyTorch handles Regularization

DuttaAbhigyan · June 22, 2019, 10:40pm

So let's say I have a regularisation term that varies with the input. Since PyTorch processes inputs in batches, how do I tell PyTorch that I want the regularisation applied to each example separately, and not across all entries of the batched matrix?

For example, suppose we have an input-dependent co-adaptation regularizer, and we pass it as a scalar computed over batch_size * (2D reg_cost__matrix). How do I ensure that co-adaptation is reduced within each 2D matrix (i.e. intra-matrix optimisation) and not between the 2D matrices in the batch (i.e. inter-matrix optimisation)?
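
To make the intra vs. inter distinction concrete, here is a minimal sketch of one way this could be set up. It is not a definitive answer: the penalty (squared off-diagonal entries of each matrix's Gram matrix) is only a stand-in for whatever the actual co-adaptation regularizer is, and the function name `per_example_coadaptation` and the shape `(batch_size, n, d)` are assumptions made for illustration. The idea is to compute the penalty with batched ops that never mix the batch dimension, reduce only over the matrix dimensions, and combine the per-example values into a scalar at the very end.

```python
import torch

def per_example_coadaptation(reg_mats: torch.Tensor) -> torch.Tensor:
    """Hypothetical penalty. reg_mats: (batch_size, n, d) -> (batch_size,)."""
    # Batched Gram matrices: gram[b] = reg_mats[b] @ reg_mats[b].T.
    # torch.bmm multiplies matrix b only with matrix b, so nothing
    # from example i ever meets example j.
    gram = torch.bmm(reg_mats, reg_mats.transpose(1, 2))  # (batch, n, n)

    # Keep only the off-diagonal entries (interactions between different rows).
    n = reg_mats.shape[1]
    eye = torch.eye(n, dtype=reg_mats.dtype, device=reg_mats.device)
    off_diag = gram * (1.0 - eye)

    # Reduce over the two matrix dims only; the batch dim is left intact.
    return off_diag.pow(2).sum(dim=(1, 2))

torch.manual_seed(0)
x = torch.randn(4, 3, 5, requires_grad=True)  # toy batch of 4 matrices

per_example = per_example_coadaptation(x)  # one penalty per example
reg_loss = per_example.mean()              # scalar, combined only at the end
reg_loss.backward()                        # x.grad[b] depends only on x[b]
```

Because the scalar is just an average of per-example terms, the gradient that reaches each 2D matrix comes only from that matrix's own penalty. Passing a scalar to `backward()` does not by itself introduce any coupling between examples; coupling would only appear if the penalty itself were computed with an op that mixes the batch dimension (for example, flattening the batch into one big matrix before taking products).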