I have a weight matrix in a neural network and I want to force its diagonal elements to be zero. All the other entries of the weight matrix are independently adjustable (i.e. the diagonal is constant and zero, but the other weights are learnable). Later on there may be different regularizations as well, but the solution here shouldn’t involve regularization.

I can think of a couple of ways to do this:

-initialize the diagonal elements of the weight matrix to be zero, and then set `requires_grad = False` for all diagonal elements

-create a constant variable that’s a mask of 1s and 0s (0 on the diagonal, 1 elsewhere) and multiply it elementwise with an adjustable variable, then use this product as the weight matrix
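
To make the second option concrete, here is a minimal sketch of what I have in mind, assuming PyTorch and a square weight matrix. The names `ZeroDiagLinear`, `raw_weight` and `mask` are just placeholders I made up for illustration, not anything from the library:

```python
import torch
import torch.nn as nn


class ZeroDiagLinear(nn.Module):
    """Square linear layer whose weight diagonal is held at zero by a fixed mask."""

    def __init__(self, n: int):
        super().__init__()
        # Unconstrained learnable parameter; its diagonal entries never reach the output.
        self.raw_weight = nn.Parameter(torch.randn(n, n) * 0.1)
        # Constant 0/1 mask: zeros on the diagonal, ones elsewhere.
        # Registered as a buffer so it follows .to(device) but is not trained.
        self.register_buffer("mask", 1.0 - torch.eye(n))

    @property
    def weight(self) -> torch.Tensor:
        # Effective weight matrix: the diagonal is exactly zero on every call.
        return self.raw_weight * self.mask

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.weight.t()


torch.manual_seed(0)
layer = ZeroDiagLinear(4)
x = torch.randn(8, 4)
layer(x).sum().backward()
```

Since the mask multiplies the parameter in the forward pass, the gradient that reaches the diagonal entries of `raw_weight` is also zero. For the first option, as far as I can tell `requires_grad` is a flag on the whole tensor and cannot be set for individual elements, so it would need some workaround (for example zeroing the diagonal of the gradient by hand).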

I was wondering what you would recommend as the best approach.