How to make part of a tensor trainable?

Hi,

I recently encountered a problem. I have a learnable matrix

self.learnable_A = nn.Parameter(torch.empty(n, n))

and an unlearnable matrix (just a tensor):

self.unlearnable_B = torch.randn(m, m)

where m >> n. We also have an index vector v (a Python list of length n):

v = [2, 5, 11, 1, ..., 19]

where each element v_i of v satisfies 0 <= v_i < m (so it is a valid index into B).

Target: We want to insert self.learnable_A into self.unlearnable_B according to v (rows = v and cols = v), and keep the elements coming from self.learnable_A trainable (requires_grad=True) while the elements of self.unlearnable_B stay untrainable (requires_grad=False).

I have tried many ways to do this but haven’t gotten it to work. Is this possible?

Hi Antonio!

You may use PyTorch tensor indexing to write A into the
desired locations of B. (I take “rows = v and cols = v” to
mean B[v_i, v_j] = A[i, j], which is what the script below
does.)

After you write A into B, the “non-leaf” tensor B will carry
requires_grad = True and depend on the trainable leaf
tensor A. So you will be able to backpropagate through B
to A and train A using those backpropagated gradients.

Even though B carries requires_grad = True (after being
written into), it is not a leaf tensor, so it isn’t “trainable.”

(Note that things like requires_grad and being a leaf tensor
apply to entire tensors as a whole and not to individual
elements of tensors. But the non-leaf tensor B is computed
from the leaf tensor A and all of A is “trainable” and carries
requires_grad = True, which is what you want.)
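
In your module, the write into B should therefore happen
inside forward() on every call, so that the fresh non-leaf
result always depends on the current value of learnable_A.
Here is a minimal sketch of one way to structure this (the
class name, shapes, and initialization are illustrative
assumptions, not taken from your code; the indexing is the
same as in the script further down):

import torch
import torch.nn as nn

class PartiallyTrainable (nn.Module):
    def __init__ (self, n, m, v):
        super().__init__()
        # trainable n x n block -- a leaf Parameter
        self.learnable_A = nn.Parameter (torch.randn (n, n))
        # fixed m x m background -- a buffer, so it never gets gradients
        self.register_buffer ('unlearnable_B', torch.randn (m, m))
        # index vector as a buffer so it moves with .to (device)
        self.register_buffer ('v', torch.as_tensor (v))

    def forward (self):
        B = self.unlearnable_B.clone()     # don't modify the buffer in place
        rows = self.v.unsqueeze (1)        # shape (n, 1) broadcasts against
        cols = self.v.unsqueeze (0)        # shape (1, n) to index the block
        B[rows, cols] = self.learnable_A   # B is now a non-leaf depending on A
        return B

An optimizer built from model.parameters() will then see
only learnable_A (buffers are not parameters), which gives
exactly the partially-trainable behavior you describe.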

Consider:

>>> import torch
>>> print (torch.__version__)
2.4.1
>>>
>>> n = 5
>>> m = 11
>>>
>>> A = (torch.arange (n * n).float() + 1).reshape (n, n)
>>> A.requires_grad = True
>>> A
tensor([[ 1.,  2.,  3.,  4.,  5.],
        [ 6.,  7.,  8.,  9., 10.],
        [11., 12., 13., 14., 15.],
        [16., 17., 18., 19., 20.],
        [21., 22., 23., 24., 25.]], requires_grad=True)
>>>
>>> B = torch.zeros (m, m)
>>> B
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
>>>
>>> v = torch.tensor ([1, 3, 5, 7, 9])
>>> ind = v.repeat (n, 1)
>>> ind
tensor([[1, 3, 5, 7, 9],
        [1, 3, 5, 7, 9],
        [1, 3, 5, 7, 9],
        [1, 3, 5, 7, 9],
        [1, 3, 5, 7, 9]])
>>>
>>> B[ind.T, ind] = A
>>> B
tensor([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  1.,  0.,  2.,  0.,  3.,  0.,  4.,  0.,  5.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  6.,  0.,  7.,  0.,  8.,  0.,  9.,  0., 10.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0., 11.,  0., 12.,  0., 13.,  0., 14.,  0., 15.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0., 16.,  0., 17.,  0., 18.,  0., 19.,  0., 20.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0., 21.,  0., 22.,  0., 23.,  0., 24.,  0., 25.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]],
       grad_fn=<IndexPutBackward0>)
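
As a quick check, we can continue the session: A is the
trainable leaf, B is a non-leaf, and gradients flow back
through B into A (the .sum() here is just a stand-in for a
real loss):

>>> A.is_leaf, B.is_leaf
(True, False)
>>> B.sum().backward()
>>> A.grad
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])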

Best.

K. Frank

Thank you very much! As you said, non-leaf tensors are not trainable; my problem is solved. Thanks again!