Adam with different learning rates per layer

Hello,

I’d like to specify different learning rates for different layers in my optimizer. The thing is that my network has two separate outputs.
Here’s the code:

class AC(nn.Module):

	def __init__(self, env_infos):
		nn.Module.__init__(self)
		self.env_infos = env_infos
		self.p1 = nn.Linear(env_infos[0], 10)
		self.p2 = nn.Linear(10, env_infos[1])

		self.v1 = nn.Linear(env_infos[0], 10)
		self.v2 = nn.Linear(10, 1)

		# I first tried
		# self.a_1 = optim.Adam([self.p1, self.p2], 5e-3)
		# self.a_2 = optim.Adam([self.v1, self.v2], 1e-2)

		self.a_1 = optim.Adam([self.p1.parameters(), self.p2.parameters()], 5e-3)
		self.a_2 = optim.Adam([self.v1.parameters(), self.v2.parameters()], 1e-2)

Since the optimizer works only with Variables, should I pass it the state_dict?

Thanks!

The state_dict might contain other stuff too.

The problem is that [self.p1.parameters(), self.p2.parameters()] is a list containing two generators, not a list containing a bunch of parameters.

Replace it with one of these (see the sketch after the list):

  • [*self.p1.parameters(), *self.p2.parameters()]
  • list(self.p1.parameters()) + list(self.p2.parameters())
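For example, here is a minimal standalone sketch using the same layer shapes and learning rates as above (the env_infos values are made up just for illustration; either unpacking form behaves the same way):

	import torch.nn as nn
	import torch.optim as optim

	env_infos = [4, 2]  # hypothetical input/output sizes, for illustration only
	p1, p2 = nn.Linear(env_infos[0], 10), nn.Linear(10, env_infos[1])
	v1, v2 = nn.Linear(env_infos[0], 10), nn.Linear(10, 1)

	# Unpack the generators so Adam receives actual Parameters
	a_1 = optim.Adam([*p1.parameters(), *p2.parameters()], lr=5e-3)
	a_2 = optim.Adam([*v1.parameters(), *v2.parameters()], lr=1e-2)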

For your use case, the Per-Parameter Options section in the PyTorch documentation might have good info, too.
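As a rough sketch of what that could look like here, assuming env_infos and an AC instance named model built as in the first post (both names are illustrative), you would use a single optimizer with two parameter groups, each carrying its own learning rate:

	import torch.optim as optim

	model = AC(env_infos)
	optimizer = optim.Adam([
	    {"params": [*model.p1.parameters(), *model.p2.parameters()], "lr": 5e-3},
	    {"params": [*model.v1.parameters(), *model.v2.parameters()], "lr": 1e-2},
	])

A single optimizer.step() call then updates both parts of the network with their respective learning rates.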

Best regards

Thomas


Yep, it works like this! Thanks!