Is it possible to set different learning rates for different parts of a tensor?
That’s not possible, as you would hit the same limitations explained in your other topic.
So the only way to have different learning rates is to manually scale the gradients in the backward pass, right?
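The gradient-scaling idea mentioned above could be sketched like this, using `register_hook` to multiply the gradient elementwise before the optimizer step (the tensor sizes and scale factors are made up for illustration):

```python
import torch

# Hypothetical example: a 4-element weight where the first two elements
# should learn 10x faster than the last two.
weight = torch.zeros(4, requires_grad=True)

# Per-element scale factors (an assumption for illustration).
lr_scale = torch.tensor([10.0, 10.0, 1.0, 1.0])

# The hook scales the incoming gradient elementwise, so a plain SGD step
# effectively applies a different learning rate per element.
weight.register_hook(lambda grad: grad * lr_scale)

optimizer = torch.optim.SGD([weight], lr=0.1)

loss = (weight - 1.0).pow(2).sum()
loss.backward()
optimizer.step()
# The first two elements moved 10x further than the last two.
```

Note that adaptive optimizers such as Adam normalize the gradient magnitude internally, so this scaling trick only maps cleanly onto a learning-rate change for plain SGD.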
No, you could also use different (sub-)tensors and stack/concatenate them in the actual forward pass, as explained e.g. here.
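A minimal sketch of that approach, assuming a hypothetical `SplitLinear` module: the weight is stored as two separately registered `nn.Parameter`s, concatenated in `forward`, and each sub-tensor is placed in its own optimizer param group with its own learning rate:

```python
import torch
import torch.nn as nn

class SplitLinear(nn.Module):
    """Linear layer whose weight is built from two sub-tensors."""

    def __init__(self, in_features, out_features, split):
        super().__init__()
        # Two separately registered slices of the weight matrix.
        self.w_fast = nn.Parameter(torch.randn(split, in_features))
        self.w_slow = nn.Parameter(torch.randn(out_features - split, in_features))

    def forward(self, x):
        # Stitch the full weight together for the actual computation;
        # autograd routes gradients back to each sub-tensor separately.
        weight = torch.cat([self.w_fast, self.w_slow], dim=0)
        return x @ weight.t()

model = SplitLinear(in_features=8, out_features=4, split=2)

# Each sub-tensor gets its own learning rate via optimizer param groups.
optimizer = torch.optim.SGD([
    {"params": [model.w_fast], "lr": 1e-2},
    {"params": [model.w_slow], "lr": 1e-4},
])

out = model(torch.randn(3, 8))
out.sum().backward()
optimizer.step()
```

Since the sub-tensors are ordinary parameters, this works with any optimizer (including adaptive ones), at the cost of one `torch.cat` per forward pass.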