Fill diagonal of matrix with zero

I have a very large n x n tensor and I want to set its diagonal values to zero while keeping the operation differentiable, so that backward still works. How can this be done? Currently, the solution I have in mind is this:

t1 = torch.rand(n, n)
t1 = t1 * (torch.ones(n, n) - torch.eye(n, n))

However, if n is large, this can require a lot of memory. Is there a simpler differentiable solution, perhaps similar to NumPy's np.fill_diagonal function?

>>> import numpy as np
>>> import torch
>>> t = torch.randn(5, 5)
>>> ind = np.diag_indices(t.shape[0])  # (row indices, column indices) of the diagonal
>>> t[ind[0], ind[1]] = torch.zeros(t.shape[0])
>>> t
tensor([[ 0.0000, -1.2111,  2.0369, -0.7607,  0.1844],
        [ 1.2293,  0.0000, -1.0472, -0.5150,  0.1518],
        [ 0.1972, -0.3708,  0.0000, -1.2243,  0.1612],
        [-0.1637, -1.2848, -0.1972,  0.0000,  1.3353],
        [ 1.1711,  0.9332,  0.0911, -0.3391,  0.0000]])

The replace op is non-differentiable.

t1 = torch.rand(n, n)
t1 = t1 * (1 - torch.eye(n, n))
So what?


Yeah, but when the diagonal entries are replaced with a constant (zero), their gradients will be zero anyway, I suppose. Isn’t that true?

Another solution:
t = torch.randn(n, n)
mask = torch.eye(n, n).bool()  # boolean mask; byte (uint8) masks are deprecated for masked_fill_
t.masked_fill_(mask, 0)


You can code a simple net and call backward to observe the gradients.
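Along those lines, here is a minimal sketch of such a check. It skips the network and uses a plain sum as a stand-in loss, which is my own simplification, and the names `n`, `x`, `y_mul`, `y_fill` are just illustrative. It compares the gradients produced by the multiplication approach and by an out-of-place `masked_fill`.

```python
import torch

n = 4  # small size, just so the gradients are easy to inspect
x = torch.rand(n, n, requires_grad=True)

# Approach 1: multiply by (1 - I), as in the original question
y_mul = x * (1 - torch.eye(n))
(grad_mul,) = torch.autograd.grad(y_mul.sum(), x)

# Approach 2: out-of-place masked_fill with a boolean diagonal mask
mask = torch.eye(n, dtype=torch.bool)
y_fill = x.masked_fill(mask, 0.0)
(grad_fill,) = torch.autograd.grad(y_fill.sum(), x)

print(grad_mul)   # gradient w.r.t. x for approach 1
print(grad_fill)  # gradient w.r.t. x for approach 2
```

With a sum as the loss, both gradients should be one off the diagonal and zero on it, which is what you would want here: the zeroed entries no longer influence the output.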

There’s an easier solution: fill_diagonal_

fill_diagonal_(fill_value, wrap=False) → Tensor
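A short usage sketch, in case it helps. Since `fill_diagonal_` is in-place, it cannot be called directly on a leaf tensor that requires grad, so the autograd part below works on a clone. That workaround and the names `t`, `x`, `y` are my assumptions, not something stated in the docs excerpt above.

```python
import torch

n = 4

# Plain tensor: in-place fill, no n x n mask has to be allocated
t = torch.rand(n, n)
t.fill_diagonal_(0)

# With autograd: clone first, because an in-place op on a leaf
# tensor that requires grad raises an error
x = torch.rand(n, n, requires_grad=True)
y = x.clone().fill_diagonal_(0)
y.sum().backward()

print(t)
print(x.grad)  # gradient of the sum w.r.t. x
```

The clone still costs one extra n x n tensor, but it avoids building the identity mask on top of that.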


How can this be done for a batch of tensors? For a tensor of shape N x M x M, I have to set the diagonal to zero for every M x M matrix. fill_diagonal_ doesn’t work if N isn’t equal to M.

Break it into smaller problems. Loop over the batch index from 0 to N-1 and run the function on each M x M slice.

I did that, but I wanted a solution that doesn’t involve a loop.
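Two loop-free options that might work here, sketched under the assumption that the batch has shape (N, M, M). One zeroes the diagonal in place through the view returned by `diagonal(dim1=-2, dim2=-1)`; the other uses an M x M boolean mask that broadcasts over the batch dimension. The loop version is included only as a baseline for comparison, and all variable names are illustrative.

```python
import torch

N, M = 3, 4
t = torch.rand(N, M, M)

# Option 1 (in-place): diagonal() returns a view of shape (N, M) into the
# tensor, so zeroing the view zeroes every matrix's diagonal
t_inplace = t.clone()
t_inplace.diagonal(dim1=-2, dim2=-1).zero_()

# Option 2 (out-of-place): an M x M boolean mask broadcasts over the batch
mask = torch.eye(M, dtype=torch.bool)
t_masked = t.masked_fill(mask, 0.0)

# Baseline: the per-matrix loop suggested above
t_loop = t.clone()
for i in range(N):
    t_loop[i].fill_diagonal_(0)

print(torch.equal(t_inplace, t_loop), torch.equal(t_masked, t_loop))
```

Option 2 leaves the input untouched, so it is probably the safer choice when `t` is part of an autograd graph; the mask is only M x M regardless of N.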