# Different learning rates/hyper-parameters for every element of a tensor

I wanted to try out different sets of optimizer hyperparameters for each element of a tensor. I tried the following but I am getting a `non-leaf tensor` error, possibly because I am indexing the tensor:

``````
y = torch.ones(3, requires_grad=True)
print(y)
opt2 = torch.optim.SGD([{'params':[y[0]],'lr':0.1},{'params':[y[1]],'lr':1},{'params':[y[2]],'lr':10}])
loss2 = y.sum()
loss2.backward()
opt2.step()
print(y)
``````

``````
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      3 print(y)
----> 4 opt2 = torch.optim.SGD([{'params':[y[0]],'lr':0.1},{'params':[y[1]],'lr':1},{'params':[y[2]],'lr':10}])
      5 loss2 = y.sum()

/usr/local/lib/python3.6/dist-packages/torch/optim/sgd.py in __init__(self, params, lr, momentum, dampening, weight_decay, nesterov)
     62         if nesterov and (momentum <= 0 or dampening != 0):
     63             raise ValueError("Nesterov momentum requires a momentum and zero dampening")
---> 64         super(SGD, self).__init__(params, defaults)
     65
     66     def __setstate__(self, state):

/usr/local/lib/python3.6/dist-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
     41
     42         for param_group in param_groups:
---> 43             self.add_param_group(param_group)
     44
     45     def __getstate__(self):

/usr/local/lib/python3.6/dist-packages/torch/optim/optimizer.py in add_param_group(self, param_group)
    191                                     "but one of the params is " + torch.typename(param))
    192             if not param.is_leaf:
--> 193                 raise ValueError("can't optimize a non-leaf Tensor")
    194
    195         for name, default in self.defaults.items():

ValueError: can't optimize a non-leaf Tensor
``````

Hi,

When you do `y[0]`, the Tensor you get is not a leaf Tensor anymore.
Remember, a leaf Tensor is one that you created directly with `requires_grad=True` (and so is not the result of an operation).
You can only optimize leaf Tensors.
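You can check this with the `is_leaf` attribute, a small sketch:

```python
import torch

# A tensor created directly with requires_grad=True is a leaf
y = torch.ones(3, requires_grad=True)
print(y.is_leaf)     # True

# Indexing produces a new tensor that is the result of an operation,
# so it is no longer a leaf and cannot be handed to an optimizer
print(y[0].is_leaf)  # False
```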

If you want to use the built-in optimizers, you will need to create one Tensor for every parameter and combine them during the forward pass:

``````
# Create one leaf Tensor per parameter, e.g.:
y0 = torch.ones(1, requires_grad=True)
y1 = torch.ones(1, requires_grad=True)
y2 = torch.ones(1, requires_grad=True)

opt2 = torch.optim.SGD([{'params': [y0], 'lr': 0.1}, {'params': [y1], 'lr': 1}, {'params': [y2], 'lr': 10}])

# During the forward pass:
y = torch.cat([y0, y1, y2], 0)
# The rest of your forward
``````
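Putting it together, a minimal end-to-end sketch, assuming the toy `sum` loss from the question (the `torch.ones` initialization is just for illustration):

```python
import torch

# One leaf tensor per "element", each in its own param group
y0 = torch.ones(1, requires_grad=True)
y1 = torch.ones(1, requires_grad=True)
y2 = torch.ones(1, requires_grad=True)

opt = torch.optim.SGD([{'params': [y0], 'lr': 0.1},
                       {'params': [y1], 'lr': 1.0},
                       {'params': [y2], 'lr': 10.0}])

opt.zero_grad()
y = torch.cat([y0, y1, y2], 0)  # rebuilt on every forward pass
loss = y.sum()
loss.backward()                  # each grad is 1.0
opt.step()

# Each element moved by its own learning rate: 1 - lr * 1
print(y0.item(), y1.item(), y2.item())  # approximately 0.9, 0.0, -9.0
```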

I tried the snippet I am pasting below, and I think I will have to execute a `stack` or `cat` after each `optimizer.step` call. In the snippet, changes to the leaf tensors don't seem to be communicated to the `cat`/`stack` result, while changes to the `stack`/`cat` result are communicated to the leaf:

``````
a = torch.randn(1, 2)
b = torch.randn(1, 2)
c = torch.cat([a, b], 0)

print(a)
a.data = a.data + 1  # modify the leaf
print(a)
print('*' * 50)
print(c)
print('\n\n')
c.data[0, :] = c.data[0, :] * 10  # modify the cat result
print(c, a)
``````

``````
tensor([[ 0.5121, -0.8800]])
tensor([[1.5121, 0.1200]])
**************************************************
tensor([[ 0.5121, -0.8800],
        [-0.6470,  0.2288]])


tensor([[ 5.1207, -8.7999],
        [-0.6470,  0.2288]]) tensor([[1.5121, 0.1200]])
``````

Hi,

As I said, you will need to run `cat` for each forward pass.

The changes are never "communicated" to the inputs of `cat` or `stack`: these are out-of-place operations that copy their inputs.
The seemingly weird value of `a` in your final print is just because you changed `a` first; `c` still holds the values `a` had when `cat` was called, so multiplying `c.data[0, :]` by 10 scales that old copy and leaves `a` untouched.
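A small sketch of this point, without the in-place modification that made the earlier output confusing:

```python
import torch

# cat copies its inputs: later changes to `a` never show up in `c`
a = torch.randn(1, 2)
b = torch.randn(1, 2)
c = torch.cat([a, b], 0)

a += 1  # modify the input after the cat

print(torch.equal(c[0], a[0]))  # False: c still holds the old values

# To see the update, rebuild c -- e.g. after every optimizer.step()
c = torch.cat([a, b], 0)
print(torch.equal(c[0], a[0]))  # True
```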