Multiple instances of the same model!?

Hello,

I have a situation where I need to work with multiple instances of the same model, like this:

class Decoder(nn.Module):
    pass

decoders = []
for _ in range(some_number):
    decoders.append(Decoder())

A couple of questions:

  1. All the decoder instances are independent of each other as long as they don’t share a tensor. True/False?
  2. To save them to disk and reload them, I'll have to loop through and do something like this:
'decoder0': decoders[0].state_dict(),
.
.

Is there a better way?

Primarily, I am trying to understand the independence among the decoders themselves, and the conditional dependence that comes in when there is a shared model (like an encoder).

Cheers.

  1. Yes, all Decoder instances are independent, since you’ve initialized each one separately.
  2. In your current approach, yes. However, you could also use an nn.ModuleList, which returns the states of all registered submodules via its state_dict() method:
import torch.nn as nn

# Registering the submodules in an nn.ModuleList makes them visible to state_dict()
modules = nn.ModuleList()
for _ in range(10):
    modules.append(nn.Linear(1, 1))

# One call returns the parameters of every registered submodule
modules.state_dict()
> OrderedDict([('0.weight', tensor([[-0.0277]])),
             ('0.bias', tensor([-0.3542])),
             ('1.weight', tensor([[0.2417]])),
             ('1.bias', tensor([0.2794])),
             ('2.weight', tensor([[0.6173]])),
             ('2.bias', tensor([0.7524])),
             ('3.weight', tensor([[-0.9020]])),
             ('3.bias', tensor([0.7507])),
             ('4.weight', tensor([[-0.2359]])),
             ('4.bias', tensor([0.6560])),
             ('5.weight', tensor([[-0.8661]])),
             ('5.bias', tensor([-0.9012])),
             ('6.weight', tensor([[0.7482]])),
             ('6.bias', tensor([0.6804])),
             ('7.weight', tensor([[0.7841]])),
             ('7.bias', tensor([-0.6375])),
             ('8.weight', tensor([[0.6187]])),
             ('8.bias', tensor([-0.3414])),
             ('9.weight', tensor([[0.2675]])),
             ('9.bias', tensor([0.0969]))])
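
Saving and restoring then becomes a single call on the container; a minimal sketch, assuming a placeholder file name "decoders.pt":

import torch
import torch.nn as nn

decoders = nn.ModuleList(nn.Linear(1, 1) for _ in range(10))

# One state_dict holds the parameters of all decoders ("decoders.pt" is a placeholder path)
torch.save(decoders.state_dict(), "decoders.pt")

# Rebuild a ModuleList with the same structure and load the saved parameters into it
restored = nn.ModuleList(nn.Linear(1, 1) for _ in range(10))
restored.load_state_dict(torch.load("decoders.pt"))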

Great! Thanks for that clarification.

And what about the case where an encoder comes into the picture?

class Encoder(nn.Module):
    pass

class Decoder(nn.Module):
    pass

encoder = Encoder()

decoders = []
for _ in range(some_number):
    decoders.append(Decoder())

intermediate = encoder(input)

output0 = decoders[0](intermediate)
output1 = decoders[1](intermediate)
.
.

In this scenario, the decoders are still independent of each other, but the encoder is connected to all of the decoders through the computation graph. Correct?

Yes, each output tensor will be attached to a computation graph that involves the same encoder instance. If you calculate losses based on these outputs and call backward() on them, the gradients will be accumulated in the parameters of encoder:

import torch
import torch.nn as nn

encoder = nn.Linear(1, 1)

decoders = nn.ModuleList()
for _ in range(3):
    decoders.append(nn.Linear(1, 1))

x = torch.randn(1, 1)
intermediate = encoder(x)

# All three outputs share the part of the graph created by the encoder
output0 = decoders[0](intermediate)
output1 = decoders[1](intermediate)
output2 = decoders[2](intermediate)

# Each backward() call adds its contribution to encoder.weight.grad
output0.backward(retain_graph=True)
print(encoder.weight.grad)
output1.backward(retain_graph=True)
print(encoder.weight.grad)
output2.backward()
print(encoder.weight.grad)
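
Summing the outputs and calling backward() once should accumulate the same encoder gradient, while each decoder only receives the gradient from its own output; a minimal sketch along the same lines:

import torch
import torch.nn as nn

encoder = nn.Linear(1, 1)
decoders = nn.ModuleList(nn.Linear(1, 1) for _ in range(3))

x = torch.randn(1, 1)
intermediate = encoder(x)

# One backward pass through the summed outputs accumulates all three
# decoder contributions in the encoder's gradient at once
total = sum(dec(intermediate) for dec in decoders)
total.backward()

print(encoder.weight.grad)       # contributions from all three decoders
print(decoders[0].weight.grad)   # only the contribution of decoder 0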

Makes sense. Thank you : )
