Fill diagonal of matrix with zero

AreTor · January 19, 2019, 11:40am

I have a very large n x n tensor and I want to set its diagonal to zero while keeping the operation differentiable, so that backward still works. How can this be done? The solution I currently have in mind is this:

t1 = torch.rand(n, n)
t1 = t1 * (torch.ones(n, n) - torch.eye(n, n))

However, if n is large this can require a lot of memory. Is there a simpler differentiable solution, perhaps similar to NumPy's np.fill_diagonal function?

vmirly1 · January 19, 2019, 5:20pm

>>> import numpy as np
>>> import torch
>>> t = torch.randn(5, 5)
>>> ind = np.diag_indices(t.shape[0])
>>> t[ind[0], ind[1]] = torch.zeros(t.shape[0])
>>> t
tensor([[ 0.0000, -1.2111,  2.0369, -0.7607,  0.1844],
        [ 1.2293,  0.0000, -1.0472, -0.5150,  0.1518],
        [ 0.1972, -0.3708,  0.0000, -1.2243,  0.1612],
        [-0.1637, -1.2848, -0.1972,  0.0000,  1.3353],
        [ 1.1711,  0.9332,  0.0911, -0.3391,  0.0000]])

Sunshine352 · January 20, 2019, 5:14am

The replace op is non-differentiable.

Sunshine352 · January 20, 2019, 5:15am

t1 = torch.rand(n, n)
t1 = t1 * (1 - torch.eye(n, n))
So what?

vmirly1 · January 20, 2019, 5:16am

Yeah, but when replacing them with a constant (zero), the gradients will stay zero, I suppose. Isn't that true?

Sunshine352 · January 20, 2019, 5:25am

Another solution:
t = torch.randn(n, n)
mask = torch.eye(n, n).bool()  # boolean mask; byte masks are deprecated
t.masked_fill_(mask, 0)

Sunshine352 · January 20, 2019, 5:27am

You can code a simple net and call backward to observe the gradients.
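
A minimal sketch of such a check, assuming a small illustrative size n = 4 and a plain sum standing in for the loss of a real net. It uses the out-of-place masked_fill so that the input can stay a leaf tensor that requires grad:

```python
import torch

n = 4  # illustrative size, not from the thread
t = torch.rand(n, n, requires_grad=True)

# Out-of-place masked fill: diagonal entries are replaced by the constant 0.
mask = torch.eye(n).bool()
out = t.masked_fill(mask, 0.0)

# A plain sum stands in for a real loss and keeps the gradient easy to read.
out.sum().backward()

# Inspect the gradient: the filled (diagonal) positions get no gradient.
print(t.grad)
```

With this loss, the gradient should be 1 at the off-diagonal positions and 0 on the diagonal, which is the same gradient the multiplicative mask would give.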

redtailedhawk · December 24, 2019, 1:55am

There’s an easier solution: fill_diagonal_

Tensor.fill_diagonal_(fill_value, wrap=False) → Tensor
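
A short sketch of how this could be used when gradients are needed. The names w and t and the size n = 4 are illustrative assumptions. Because fill_diagonal_ works in place, and in-place operations are not allowed on a leaf tensor that requires grad, the sketch fills a clone rather than the original:

```python
import torch

n = 4  # illustrative size
w = torch.rand(n, n, requires_grad=True)

# In-place ops cannot be applied directly to a leaf tensor that requires
# grad, so fill the diagonal on a differentiable copy instead.
t = w.clone()
t.fill_diagonal_(0.0)

# A plain sum stands in for a real loss.
t.sum().backward()

print(t.diagonal())  # the filled diagonal
print(w.grad)        # gradient with respect to the original tensor
```

The clone still costs one extra n x n tensor, but no separate mask has to be built.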

Usman_Mahmood · March 3, 2021, 8:26pm

How can this be done for a batch of tensors? For a tensor with shape N x M x M, I have to set the diagonal to zero for every M x M matrix. fill_diagonal_ doesn't work if N isn't equal to M.

redtailedhawk · March 3, 2021, 9:25pm

Break it into smaller problems. Loop over the first dimension with indices 0 to N-1 and run the function on each M x M slice.
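
The loop could look roughly like this, assuming illustrative sizes N = 3 and M = 4 and a tensor named batch that does not require grad:

```python
import torch

N, M = 3, 4  # illustrative sizes
batch = torch.rand(N, M, M)

# Indexing the first dimension yields a view of each M x M slice,
# so the in-place fill writes through to `batch`.
for i in range(N):
    batch[i].fill_diagonal_(0.0)
```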

Usman_Mahmood · March 4, 2021, 11:09pm

I did that, but I wanted a solution that doesn't involve a loop.

Sandro_Hauri · November 1, 2022, 12:18pm

A bit late, but if anyone is interested in the batched version without loop:

t1 = torch.rand(m, n, n)
t2 = t1 * (1 - torch.eye(n, n).repeat(m, 1, 1))
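
Two possible variations on the batched version, sketched with illustrative sizes m = 3 and n = 4. The first relies on broadcasting, so the repeated (m, n, n) mask does not need to be built. The second zeroes the diagonals through the view returned by torch.diagonal, working on a clone so that t1 itself is left unchanged:

```python
import torch

m, n = 3, 4  # illustrative sizes
t1 = torch.rand(m, n, n)

# Option 1: broadcasting. The (n, n) mask is stretched over the batch
# dimension automatically, so no (m, n, n) mask is materialized.
t2 = t1 * (1 - torch.eye(n))

# Option 2: zero the diagonals through a view of a copy.
# torch.diagonal returns a view of shape (m, n) into t3.
t3 = t1.clone()
torch.diagonal(t3, dim1=-2, dim2=-1).zero_()
```

Both options should give the same result as the repeat-based version above.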