Variable update after autograd in torch.cat, torch.vstack, nn.ZeroPad2d, torch.index_select and torch.roll

Hi,

Let x = torch.vstack((a, b)), where a is being optimized using autograd. How can the new value of x be updated without calling x = torch.vstack((a, b)) again, i.e. so that the new value of a appears automatically in x (using shared memory or something like that)?

Same question for nn.ZeroPad2d, torch.index_select, torch.cat, torch.roll and in-place operations.

PS: torch.flatten, for example, does update the values properly.

Here is an example:

import torch
import torch.optim as optim

x = torch.randn(3, 4, requires_grad=True)
y = torch.cat((x, x))    # copies x into a new tensor
L = torch.randn(3, 4)
L[0, 0] = x[0, 0]
f = torch.flatten(x)     # view of the contiguous tensor x, shares its memory

t = y.sum()
t.backward()
opt = optim.Adam([x], lr=0.1)
print("x", x)
print("f", f)
print("y", y)
print("L", L)
opt.step()
print("========================")
print("========================")
print("x", x)
print("f", f)
print("y", y)
print("L", L)
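
The difference can be checked by comparing data pointers: torch.flatten on a contiguous tensor returns a view that shares x's storage, while torch.cat writes x's values into a freshly allocated buffer. A quick sketch of that check:

import torch

x = torch.randn(3, 4, requires_grad=True)
f = torch.flatten(x)   # view of the contiguous tensor x, same storage
y = torch.cat((x, x))  # new allocation, x's values are copied into it

print(x.data_ptr() == f.data_ptr())  # True: f sees in-place updates of x
print(x.data_ptr() == y.data_ptr())  # False: y keeps the old values after opt.step()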

Thanks! :slight_smile:

If you begin with two separate tensors, they live in different places in memory, so it is not possible to combine them into a single tensor without copying them both into a new tensor.

One trick is to begin with the big tensor instead, and then take small views of that tensor.

import torch
import torch.optim as optim

# some dummy tensors that have the same shape as `a` used to initialize `x`
_a = torch.tensor(1.)
_b = torch.tensor(2.)
x = torch.vstack((_a, _b))

# now we finally create `a`
a = x.select(0, 0)
a.requires_grad_(True) # a.is_leaf is True
opt = optim.Adam([a]) # optimize `a`

loss = a ** 2
loss.backward()

print(a)
print(x)

opt.step()

print(a)
print(x) # should be updated as well

The trick is basically to find the inverse operation of whatever you’re doing. So for an arbitrary fn (where x = fn(a)), make sure you find an fn2 such that `a = fn2(fn(a))`.
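
For torch.cat along dim 0, for example, the inverse is just row slicing, so you can build the big tensor first and recover a and b as views of it. A rough sketch of that (the shapes here are placeholders, not taken from the original code):

import torch
import torch.optim as optim

x = torch.randn(4, 3)   # plays the role of torch.cat((a, b)) with two 2x3 blocks
a = x[:2]               # view of the first block (inverse of the cat)
b = x[2:]               # view of the second block

a.requires_grad_(True)  # both are leaves because x itself does not require grad
b.requires_grad_(True)
opt = optim.Adam([a, b], lr=0.1)

loss = (a ** 2).sum() + (b ** 2).sum()
loss.backward()
opt.step()

print(x)                # reflects the updated a and b without rebuilding the cat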


Hi,

the problem is that instead of _a and _b I have the model parameters, which are already defined and should also be optimized (in this case _a and _b are not, see my small modification to your code). So, in summary, I want the model weights to be optimized using autograd and my matrix to contain the new weight values automatically.

This code with the small modification works, but with a convolutional layer it is complicated or even impossible, because my matrix contains the 2D representation of the weight tensor plus some modifications (padding between the rows and columns, etc.). For example, let w0, w1, w2, w3 be the model weights (convolutional layers with many channels); then my matrix is defined as [[W0, W1], [W2, W3]], where W0 is a modification of w0 (reshaped into a 2D matrix, with padding between the rows and columns, etc.), and my goal is that W0, W1, W2, W3 get updated whenever w0, w1, w2, w3 do. I don’t want to rebuild my matrix every time, to save execution time.

Your code with my small modification:

import torch
import torch.optim as optim
# some dummy tensors that have the same shape as `a` used to initialize `x`
_a = torch.tensor(1.)
_b = torch.tensor(2.)
x = torch.vstack((_a, _b))

# now we finally create `a`
a = x.select(0, 0)
_a = a  # rebind the name: _a and a now refer to the same tensor object
a.requires_grad_(True) # a.is_leaf is True
opt = optim.Adam([a]) # optimize `a`

loss = a ** 2
loss.backward()

print(_a)
print(a)
print(x)

opt.step()

print(_a) # should be updated as well
print(a)
print(x) # should be updated as well

Here is an example of my goal:

import torch
import torch.optim as optim

# original weight matrix
a = torch.tensor([[[1.,2],[3.,4]], 
                  [[1.,1],[2.,2]]]) # shape [2,2,2]


# create a modified weight matrix from the original one
a_new = torch.cat((a[0], torch.zeros(2, 2), a[1]))  # shape [6, 2]; cat copies the data

a.requires_grad_(True) # a.is_leaf is True
opt = optim.Adam([a]) # optimize `a`

loss = a.sum()
loss.backward()

print("a",a)
print("a_new",a_new)

opt.step()
print("================")
print("================")


print("a",a)
print("a_new",a_new) # should be updated as well