Passing part of the NN to one optimizer, and everything else to another optimizer

MrPositron · January 9, 2021, 4:02pm

I have a seq2seq model with an encoder and a decoder. The architecture is structured as follows:

# define the encoder
class Encoder(nn.Module):
    ...
# define the decoder
class Decoder(nn.Module):
    ...
# define the seq2seq model
# the encoder and decoder are passed to it
class Seq2Seq(nn.Module):
    ...

Then, we create the network as follows:

encoder_net = Encoder(...)
decoder_net = Decoder(...)
model = Seq2Seq(encoder_net, decoder_net, ...)

In the decoder network, there is a fairly complicated sub-network, let's say on top of the nn.GRU(). I want to optimize this part with one LR, and everything other than that with another LR. How can I do that?

Can I do it like this?

optim.SGD([
    {'params': model.decoder_net.layer_x.parameters(), 'lr': 1e-4},
    {'params': model.parameters()}
], lr=1e-3, momentum=0.9)
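
A quick way to check a call like this is to try it on a toy model. The sketch below uses a hypothetical two-layer `nn.Sequential` as a stand-in for the real seq2seq (the names `toy` and `build_overlapping_optimizer` are made up for illustration). As far as I know, PyTorch optimizers reject a parameter that shows up in more than one group, so listing one layer on its own and then `model.parameters()` as well should raise a `ValueError`.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-in model; not the Seq2Seq from the post.
toy = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

def build_overlapping_optimizer(model):
    # model[1]'s parameters are listed twice: once on their own,
    # and once again inside model.parameters().
    return optim.SGD([
        {'params': model[1].parameters(), 'lr': 1e-4},
        {'params': model.parameters()},
    ], lr=1e-3, momentum=0.9)

try:
    build_overlapping_optimizer(toy)
    overlap_rejected = False
except ValueError as err:
    # The optimizer refuses overlapping parameter groups.
    overlap_rejected = True
    print(err)
```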

MrPositron · January 9, 2021, 4:11pm

Oops. I can just pass them as a list to the optimizer.
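
For anyone landing here later, one possible reading of "pass them as a list" is sketched below: build two disjoint parameter lists and hand each to the optimizer as its own group. `ToyDecoder`, `ToySeq2Seq`, and the attribute names `encoder`, `decoder`, `gru` and `layer_x` are assumptions for illustration, since the real classes are elided above; the layer sizes are arbitrary.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-ins: the real Encoder/Decoder/Seq2Seq are not shown in the post.
class ToyDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(8, 8, batch_first=True)
        self.layer_x = nn.Linear(8, 8)  # stands in for the part on top of the GRU

class ToySeq2Seq(nn.Module):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

model = ToySeq2Seq(nn.GRU(8, 8, batch_first=True), ToyDecoder())

# Split the parameters into two disjoint lists, compared by identity.
special = list(model.decoder.layer_x.parameters())
special_ids = {id(p) for p in special}
rest = [p for p in model.parameters() if id(p) not in special_ids]

optimizer = optim.SGD([
    {'params': special, 'lr': 1e-4},  # overrides the default learning rate
    {'params': rest},                 # falls back to the default lr=1e-3
], lr=1e-3, momentum=0.9)
```

Options that a group does not set itself, such as `momentum` here or `lr` for the second group, are taken from the keyword arguments given to the optimizer.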