How to train several networks at the same time?


Hi, I am a beginner with PyTorch, and with Python too, so this question may be too simple for you, but I would really appreciate some help.
I want to train a neural network that is made up of sub-networks. It looks something like this.

I think this network could be implemented by constructing three simple fully connected neural networks.
My implementation is:

import torch
import torch.nn as nn

# first, define the simple NN
class Net(nn.Module):
    def __init__(self, n_input, n_hidden1, n_hidden2):
        super(Net, self).__init__()
        self.hidden1 = nn.Linear(n_input, n_hidden1)
        self.hidden2 = nn.Linear(n_hidden1, n_hidden2)
        self.output = nn.Linear(n_hidden2, 1)

    def forward(self, x):
        x = torch.tanh(self.hidden1(x))
        x = torch.tanh(self.hidden2(x))
        x = self.output(x)
        return x

# construct 3 identical nets
net = []
for i in range(3):
    net.append(Net(n_input, n_hidden1, n_hidden2))  # layer sizes not shown in the post

# define loss function and optimizer
optimizer = []
for i in range(3):
    optimizer.append(torch.optim.SGD(net[i].parameters(), lr = 0.2))
loss_func = torch.nn.MSELoss()

#training neural network
for step in range(300):  # training cycles
    for i in range(3):
        output[:, i] = net[i](input[i, :, :])
    prediction = torch.sum(output,1)

    loss = loss_func(prediction,target)
    for i in range(3):
        optimizer[i].zero_grad()
    loss.backward()
    for i in range(3):
        optimizer[i].step()

After several cycles, all the output values become the same.
There must be something wrong with my code. Could anyone give me some help?
Or is there a better way to implement this network?
Thanks a lot.

(Hugh Perkins) #2

according to your diagram, you want to train one network that contains 3 sub networks?

but in your code, you are creating 3 separate optimizers.

I can't help wondering:

  1. why are you creating 3 optimizers?
  2. what happens if you use just one optimizer instead?


Oh, I have tried using only one optimizer. But
net = []
is a list, not a torch.autograd.Variable, so net has no parameters. PyTorch reported an error when I used one optimizer.

That’s why I used 3 optimizers.

(Hugh Perkins) #4

sure. well, you have a couple of options, because I think you should be using only one optimizer

  • the more idiomatic solution might be to create another nn.Module child class, to hold the three Net modules. The rest should be straightforward

  • otherwise you can simply do something like:

    parameters = []  # a list, since recent PyTorch optimizers reject sets
    for net_ in net:
        parameters += list(net_.parameters())
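
The first option above might look something like the following minimal sketch. The wrapper name `SumOfNets`, the layer sizes, and the random data are illustrative assumptions, not taken from the original post. `nn.ModuleList` is what registers the sub-networks, so that `model.parameters()` can see them and a single optimizer is enough.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    """Small fully connected sub-network, as in the original post."""

    def __init__(self, n_input, n_hidden1, n_hidden2):
        super().__init__()
        self.hidden1 = nn.Linear(n_input, n_hidden1)
        self.hidden2 = nn.Linear(n_hidden1, n_hidden2)
        self.output = nn.Linear(n_hidden2, 1)

    def forward(self, x):
        x = torch.tanh(self.hidden1(x))
        x = torch.tanh(self.hidden2(x))
        return self.output(x)


class SumOfNets(nn.Module):
    """Hypothetical wrapper: holds the sub-networks and sums their outputs."""

    def __init__(self, n_nets, n_input, n_hidden1, n_hidden2):
        super().__init__()
        # nn.ModuleList registers each sub-network, unlike a plain Python list
        self.nets = nn.ModuleList(
            [Net(n_input, n_hidden1, n_hidden2) for _ in range(n_nets)]
        )

    def forward(self, x):
        # x has shape (n_nets, batch, n_input); sub-network i sees x[i]
        outputs = [net(x[i]) for i, net in enumerate(self.nets)]
        return torch.stack(outputs, dim=0).sum(dim=0)  # shape (batch, 1)


torch.manual_seed(0)
model = SumOfNets(n_nets=3, n_input=4, n_hidden1=8, n_hidden2=8)  # assumed sizes
optimizer = torch.optim.SGD(model.parameters(), lr=0.2)  # one optimizer for all
loss_func = nn.MSELoss()

x = torch.randn(3, 16, 4)    # made-up inputs
target = torch.randn(16, 1)  # made-up targets

for step in range(5):
    optimizer.zero_grad()
    loss = loss_func(model(x), target)
    loss.backward()
    optimizer.step()
```

Keeping the sum inside `forward` means the training loop no longer needs to know how many sub-networks there are.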


Thanks for replying. I will try these solutions and give you feedback.


Hi, I have just tried the second solution. It works well!

But the problem that the outputs always become the same value still exists.

Fortunately, I solved this problem by adding an extra batch normalization layer.

Thank you very much!
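
The post does not say where the batch normalization layer was added. One possible placement, assumed here purely for illustration, is an `nn.BatchNorm1d` on the input of each sub-network. The class name `NetBN` and the sizes below are hypothetical.

```python
import torch
import torch.nn as nn


class NetBN(nn.Module):
    """Hypothetical variant of Net with batch normalization on the input."""

    def __init__(self, n_input, n_hidden1, n_hidden2):
        super().__init__()
        self.bn = nn.BatchNorm1d(n_input)  # assumed placement, not from the post
        self.hidden1 = nn.Linear(n_input, n_hidden1)
        self.hidden2 = nn.Linear(n_hidden1, n_hidden2)
        self.output = nn.Linear(n_hidden2, 1)

    def forward(self, x):
        x = self.bn(x)  # normalize each input feature over the batch
        x = torch.tanh(self.hidden1(x))
        x = torch.tanh(self.hidden2(x))
        return self.output(x)


net_bn = NetBN(n_input=4, n_hidden1=8, n_hidden2=8)  # assumed sizes
out = net_bn(torch.randn(16, 4))  # batch of 16 made-up samples
```

Normalizing the inputs keeps the `tanh` units away from their flat regions, which may be why it helped here.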


Would the network learn different weights when using these separate optimizers, which all optimize the same loss, or is it just a matter of style or performance?
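
With plain SGD, the update for each parameter depends only on that parameter's own gradient. Splitting disjoint parameter sets across several optimizers with the same settings should therefore give the same weights as one optimizer over all of them, which makes the choice mostly a matter of style. A small sketch to check this, using made-up stand-in networks and random data (none of it from the posts above):

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)
# three tiny stand-in sub-networks (sizes are made up)
nets_a = [nn.Linear(4, 1) for _ in range(3)]
nets_b = copy.deepcopy(nets_a)  # identical starting weights

x = torch.randn(3, 16, 4)
target = torch.randn(16, 1)
loss_func = nn.MSELoss()


def prediction(nets):
    # sum of the sub-network outputs, as in the original post
    return sum(net(x[i]) for i, net in enumerate(nets))


# variant A: one optimizer per sub-network
opts = [torch.optim.SGD(n.parameters(), lr=0.1) for n in nets_a]
# variant B: a single optimizer over all parameters
opt = torch.optim.SGD([p for n in nets_b for p in n.parameters()], lr=0.1)

for step in range(10):
    for o in opts:
        o.zero_grad()
    loss_func(prediction(nets_a), target).backward()
    for o in opts:
        o.step()

    opt.zero_grad()
    loss_func(prediction(nets_b), target).backward()
    opt.step()

# compare the trained weights of the two variants
same = all(
    torch.allclose(pa, pb)
    for na, nb in zip(nets_a, nets_b)
    for pa, pb in zip(na.parameters(), nb.parameters())
)
print(same)
```

This only checks the SGD case. Anything that couples parameters across networks, such as clipping gradients by a global norm, would need a separate look.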