How to set different learning rates for different parts of the same tensor?

Is it possible to set different learning rates for different parts of a tensor?

That’s not possible directly, since optimizers apply a learning rate per parameter group and a group operates on whole tensors, so you would be hitting the same limitations explained in your other topic.

So, the only way to have different learning rates within one tensor is to manually scale the gradients in the backward pass. Right?
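That gradient-scaling idea could be sketched like this (a hypothetical example, assuming the first row of a weight tensor should learn 10x slower than the rest; the shapes and the 0.1 factor are made up for illustration):

```python
import torch

# Hypothetical setup: one weight tensor whose first row should be
# updated with an effectively 10x smaller learning rate.
weight = torch.nn.Parameter(torch.randn(4, 3))

def scale_grad(grad):
    # Scale the gradient of the first row by 0.1; leave the rest untouched.
    grad = grad.clone()
    grad[0] *= 0.1
    return grad

# register_hook runs on the gradient during the backward pass.
weight.register_hook(scale_grad)

opt = torch.optim.SGD([weight], lr=1.0)

w_before = weight.detach().clone()
loss = (weight ** 2).sum()  # d(loss)/d(weight) = 2 * weight
loss.backward()
opt.step()
```

Note that this only changes the effective step size for plain SGD; for optimizers with per-element state such as Adam, scaling the gradient is not strictly equivalent to scaling the learning rate.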

No, you could also use different (sub-)tensors and stack/concatenate them in the actual forward pass as explained e.g. here.
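A minimal sketch of that alternative, assuming a hypothetical linear layer whose full weight is rebuilt from two trainable pieces on every forward pass, so each piece can live in its own optimizer parameter group (all names and learning rates here are illustrative):

```python
import torch

class SplitLinear(torch.nn.Module):
    # Hypothetical module: the logical (4, 3) weight is stored as two
    # sub-tensors so each can get its own learning rate.
    def __init__(self):
        super().__init__()
        self.w_slow = torch.nn.Parameter(torch.randn(1, 3))
        self.w_fast = torch.nn.Parameter(torch.randn(3, 3))

    def forward(self, x):
        # Concatenate the pieces into the full weight in the forward pass,
        # so autograd routes gradients back to each sub-tensor separately.
        weight = torch.cat([self.w_slow, self.w_fast], dim=0)
        return x @ weight.t()

model = SplitLinear()
opt = torch.optim.SGD([
    {"params": [model.w_slow], "lr": 1e-4},
    {"params": [model.w_fast], "lr": 1e-2},
])

x = torch.randn(2, 3)
loss = model(x).sum()
loss.backward()
opt.step()
```

Since the concatenation happens inside `forward`, each sub-tensor remains a leaf parameter and the optimizer's per-group learning rates apply cleanly.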
