Is it ok to pass the same parameter to optimizer multiple times?

I’m thinking of writing an encoder-decoder, with shared embedding weights, like:

embedding = nn.Embedding(input_size, hidden_size)
encoder = Encoder(embedding=embedding)
decoder = Decoder(embedding=embedding)

where, e.g., the Encoder __init__ would do:

def __init__(self, embedding):
    super().__init__()
    hidden_size = embedding.weight.size(1)
    self.embedding = embedding
    self.enc_rnn = nn.RNN(hidden_size, hidden_size)

and then create the optimizer like:

opt = optim.Adam(
    list(embedding.parameters()) +
    list(encoder.parameters()) +
    list(decoder.parameters())
)

To what extent is it OK to pass the embedding parameters to the optimizer multiple times? Is there a better way to handle this in idiomatic PyTorch? Ideally I want to avoid allocating the embedding multiple times and then deallocating the copies we don't need.


OK, I have done it like this for now, unless anyone has a better idea:

p = (
    set(encoder.parameters()) |
    set(decoder.parameters())
)
opt = optimizer_fn(p, lr=0.001)
  1. Unless I’m missing something obvious, do you really need to explicitly pass embedding.parameters() to the optimizer, since you already gave it encoder.parameters(), which includes embedding.parameters()?
  2. The code for nn.Module.__setattr__() suggests that it will remove duplicate parameters passed to it, so it seems to me that

opt = optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

should be fine.
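To make the deduplication concern concrete, here is a minimal sketch (assuming PyTorch is installed; the Encoder/Decoder classes below are reduced stand-ins for the ones in the question). It checks that the shared embedding's weight shows up in both modules' parameters(), and deduplicates by identity with dict.fromkeys (which, unlike set, preserves ordering) before building the optimizer:

```python
# Sketch: shared embedding appears in both modules' parameters();
# deduplicate by identity before handing everything to Adam.
import torch.nn as nn
import torch.optim as optim

embedding = nn.Embedding(10, 8)

class Encoder(nn.Module):
    def __init__(self, embedding):
        super().__init__()
        hidden_size = embedding.weight.size(1)
        self.embedding = embedding          # registered as a submodule
        self.enc_rnn = nn.RNN(hidden_size, hidden_size)

class Decoder(nn.Module):
    def __init__(self, embedding):
        super().__init__()
        hidden_size = embedding.weight.size(1)
        self.embedding = embedding          # the same Embedding object
        self.dec_rnn = nn.RNN(hidden_size, hidden_size)

encoder = Encoder(embedding)
decoder = Decoder(embedding)

# Each module reports the shared weight once ...
assert any(p is embedding.weight for p in encoder.parameters())
assert any(p is embedding.weight for p in decoder.parameters())

# ... so a naive list concatenation would contain it twice.
# dict.fromkeys dedups by identity and keeps a stable order.
params = list(dict.fromkeys(
    list(encoder.parameters()) + list(decoder.parameters())
))
opt = optim.Adam(params, lr=0.001)
n_unique = len(params)
```

Each module contributes 5 tensors (the embedding weight plus 4 RNN tensors), so after deduplication 9 unique parameters remain.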


You can still pass the two modules' parameters as dictionaries (parameter groups). Each module will contribute each of its parameters once to its own group, but if the two groups have parameters in common, the optimizer will raise an error:

ValueError: some parameters appear in more than one parameter group
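A small sketch of both sides of this (assuming PyTorch; the Embedding/Linear modules here are just illustrative stand-ins): two parameter groups that share a tensor make Adam raise the ValueError above, while a single group deduplicated by identity is accepted.

```python
# Sketch: a parameter shared across two param groups triggers
# "some parameters appear in more than one parameter group".
import torch.nn as nn
import torch.optim as optim

shared = nn.Embedding(10, 4)
enc_head = nn.Linear(4, 4)
dec_head = nn.Linear(4, 4)

enc_params = list(shared.parameters()) + list(enc_head.parameters())
dec_params = list(shared.parameters()) + list(dec_head.parameters())

try:
    # shared.weight sits in both groups -> ValueError
    optim.Adam([{"params": enc_params}, {"params": dec_params}], lr=1e-3)
    raised = False
except ValueError:
    raised = True

# Deduplicating into one group side-steps the error.
unique = list(dict.fromkeys(enc_params + dec_params))
opt = optim.Adam(unique, lr=1e-3)
```

The set-union workaround from earlier in the thread does the same thing; dict.fromkeys just keeps a deterministic parameter order as well.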

So here is how I would do it: set up the model classes as follows.

A class Encoder with its own __init__, and a class Decoder with its own __init__.

Finally, set up a container module class EncDec whose __init__ instantiates the Encoder and Decoder it contains.

model = EncDec("whatever arguments you need")

optimizer = optim.Adam(model.parameters(), lr=0.0001)
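A minimal sketch of that container idea (assuming PyTorch; the RNN-based encoder/decoder internals are placeholders, not the poster's exact classes). Since the shared embedding is registered once inside the container, model.parameters() yields each tensor exactly once, and the optimizer takes parameters(), not the module itself:

```python
# Sketch: one container module owning the shared embedding,
# so model.parameters() already covers everything once.
import torch
import torch.nn as nn
import torch.optim as optim

class EncDec(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.encoder = nn.RNN(hidden_size, hidden_size)
        self.decoder = nn.RNN(hidden_size, hidden_size)

    def forward(self, src, tgt):
        # encode the source, then decode the target from the final state
        _, h = self.encoder(self.embedding(src))
        out, _ = self.decoder(self.embedding(tgt), h)
        return out

model = EncDec(input_size=10, hidden_size=8)
# Pass model.parameters(), not the model itself:
optimizer = optim.Adam(model.parameters(), lr=0.0001)
```

This is generally the most idiomatic option: one top-level nn.Module, one parameters() call, no manual deduplication.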

How do I resolve that problem? I really need to optimize two groups of params…