Replace diagonal elements with vector

I have been searching everywhere for something equivalent to the following in PyTorch, but I cannot find anything.

import numpy as np

D = 5  # example dimension

# Lower-triangular matrix of standard-normal draws
L_1 = np.tril(np.random.normal(scale=1., size=(D, D)), k=0)
# Exponentiate the diagonal in place
L_1[np.diag_indices_from(L_1)] = np.exp(np.diagonal(L_1))

I guess there is no way to replace the diagonal elements in such an elegant way using PyTorch.

# A has size k x k
k = A.size(0)
A.as_strided([k], [k + 1]).copy_(vector)
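For reference, here is a minimal runnable sketch of the stride trick above (the concrete sizes and values are illustrative, not from the thread): `as_strided([k], [k + 1])` views the flat storage in steps of `k + 1`, which lands exactly on the diagonal of a contiguous k x k matrix.

```python
import torch

k = 4
A = torch.zeros(k, k)              # A has size k x k
vector = torch.arange(1., k + 1)   # values to write onto the diagonal

# View the k diagonal elements by striding k + 1 flat elements at a time,
# then copy the vector into that view (writes through to A).
A.as_strided([k], [k + 1]).copy_(vector)

print(A.diagonal())  # tensor([1., 2., 3., 4.])
```

Note this only works as intended when A is contiguous; on a non-contiguous tensor the flat strides no longer hit the diagonal.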

Thanks @SimonW. I guess it works in a similar fashion in the case of Variables, i.e. my_variable.data…

Please operate directly on the Variable rather than on var.data if you want autograd to trace the history (and run backward etc.).

@SimonW I’m getting the error “AttributeError: ‘Variable’ object has no attribute ‘as_strided’” although I have the latest PyTorch version. What is the problem?

Oh, I see… Yeah, it might not be available in 0.3.1. Before the next release, you can try advanced indexing, A[[1,2,3], [1,2,3]] = 4, but that might break backward if some part of the graph depends on the original overwritten values. Or you can multiply by a matrix and then add another matrix… Yeah, these are not ideal…

So, is there any efficient way to do the replacement in PyTorch?

Hi,

The solution with advanced indexing is the way to go for 0.3.1, I think.
Keep in mind that inplace operations are not always possible when working with Variables because the original value might be needed to compute the backward pass.

The thing is that my torch version is ‘0.3.1.post2’ and I do not seem to have the above-mentioned functions.

You should be able to do: A[[1,2,3], [1,2,3]]=4 or A[range(size), range(size)] = your_diag_value.

Will it work if your_diag_value is a vector, though?

Yes it will work if your_diag_value is a 1D tensor.
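A small self-contained sketch of that indexing assignment (the concrete sizes and values here are illustrative assumptions): pairing the row indices [0, 1, 2] with the column indices [0, 1, 2] addresses exactly the diagonal, and the right-hand side may be a 1D tensor of matching length.

```python
import torch

size = 3
A = torch.zeros(size, size)
diag = torch.tensor([1., 2., 3.])

# Advanced indexing: rows range(size) paired with cols range(size)
# address the diagonal entries, written element-wise from the 1D tensor.
A[range(size), range(size)] = diag

print(A)
```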

ok thank you. This will not affect the backward pass right?

It will affect it because the diagonal values of the original matrix are not used to compute the output anymore.
See the code sample below:

import torch
from torch.autograd import Variable

size = 10
full = Variable(torch.rand(size, size), requires_grad=True)
new_diag = Variable(torch.rand(size), requires_grad=True)

# Do this because we cannot change a leaf variable inplace
full_clone = full.clone()

full_clone[range(size), range(size)] = new_diag

full_clone.sum().backward()

# Should be 0 on the diagonal and 1 everywhere else
print(full.grad)
# Should be full of 1
print(new_diag.grad)

So, there is no way to change the diagonal values and keep the backward pass unaffected? Seems strange.

Why do you want to do that? The gradients that you compute will be completely wrong then…

I want to force the diagonal of my covariance matrix to be positive and differentiable wrt the backward pass.

Ok, not sure that makes sense…
But here is the code to do it 🙂

import torch
from torch.autograd import Variable

size = 10
full = Variable(torch.rand(size, size), requires_grad=True)
new_diag = Variable(torch.rand(size), requires_grad=True)

# Do this because we cannot change a leaf variable inplace
full_clone = full.clone()

# WARNING: using data here will break the graph and this
# operation will not be tracked by the autograd engine.
# Hence giving "wrong" gradients
full_clone.data[range(size), range(size)] = new_diag.data

full_clone.sum().backward()

# Should be full of 1
print(full.grad)
# Should be None (equivalent to full of 0)
print(new_diag.grad)
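For what it's worth, the covariance use case can also be handled without any in-place write at all: rebuild the matrix out of place so autograd tracks every term, with exp() guaranteeing a positive diagonal. This masking-free construction is a sketch of one possible approach, not something proposed in the thread:

```python
import torch
from torch.autograd import Variable

size = 10
raw = Variable(torch.rand(size, size), requires_grad=True)

# Keep the strictly lower triangle of raw as-is and rebuild the diagonal
# from exp() of raw's diagonal; everything is out of place, so the whole
# expression stays differentiable w.r.t. raw.
L = torch.tril(raw, -1) + torch.diag(torch.exp(torch.diag(raw)))

L.sum().backward()

# raw.grad now holds 1 on the strict lower triangle, exp(diag(raw)) on
# the diagonal, and 0 on the upper triangle.
print(raw.grad)
```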

I was doing something like this:

    self.L_1 = Parameter(torch.randn(dim, dim), requires_grad=True)
    self.L_1.data = torch.tril(self.L_1.data)
    self.log_diag = Parameter(torch.diag(self.L_1.data), requires_grad=True)
    self.log_diag.data = torch.exp(self.log_diag.data)
    self.mask = Parameter(torch.diag(torch.ones_like(self.log_diag.data)))
    self.L = Parameter(
        self.mask.data * torch.diag(self.log_diag.data)
        + (1. - self.mask.data) * self.L_1.data,
        requires_grad=True,
    ).cuda()

If the backward pass doesn’t need the content of that covariance matrix, then just modifying it in place is fine. (Run .backward to find out.) Otherwise, you can clone first and then modify the clone in place.
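The clone-then-modify pattern from that last reply can be sketched as follows (sizes and the modern requires_grad-on-construction style are assumptions, not from the thread):

```python
import torch

size = 3
A = torch.rand(size, size, requires_grad=True)
new_diag = torch.rand(size, requires_grad=True)

# Clone first: the in-place write would fail on the leaf tensor A, and the
# clone keeps A's original values available for the backward pass.
B = A.clone()
B[range(size), range(size)] = new_diag

B.sum().backward()

# Off-diagonal gradients flow back to A; the diagonal was overwritten, so
# its gradients flow to new_diag instead.
print(A.grad)
print(new_diag.grad)
```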