SGD supports per-parameter learning rates, you just have to pass them individually when you instantiate the optimizer.
Here’s another relevant thread on tips for doing this in a convenient way.
SGD supports per-parameter learning rates, you just have to pass them individually when you instantiate the optimizer.
Here’s another relevant thread on tips for doing this in a convenient way.