Two optimizers for two neural networks?

Hi,
I am using two fully connected NNs (net_1 and net_2). The input to the second NN is based on the output of the first NN, and the loss is calculated from the outputs of both NNs. Generic code for this purpose is given below:

import torch
from torch.autograd import grad
from torch.optim import Adam

net_1 = Net_u()
net_2 = Net_nu()
optimizer_u = Adam(net_1.parameters(), lr=0.001)
optimizer_nu = Adam(net_2.parameters(), lr=0.001)

optimizer_u.zero_grad()
optimizer_nu.zero_grad()

# y is the input tensor (defined elsewhere); it needs requires_grad=True for grad(u, y, ...) below
u = net_1(y)
gradients = torch.ones(u.shape[0], u.shape[1])  # seed for the vector-Jacobian product
u_y = grad(u, y, grad_outputs=gradients, create_graph=True)[0]  # du/dy, kept in the graph for backprop
nu = net_2(u_y)

loss = loss_fn(u, nu)  # loss_fn is defined elsewhere
loss.backward()
optimizer_u.step()
optimizer_nu.step()

Can anybody kindly tell me whether this is being done correctly and whether the two NNs are being trained without any bias or issue?
Is there a better way of doing it?

Thanks!

@muhammadirfanzafar

In your solution, the input to the second model is the gradient of the first model's output.

A simpler solution could be the following:

import torch
import torch.nn as nn
import torch.optim as optim

class Net_u(nn.Module):
    def __init__(self):
        super(Net_u, self).__init__()
        self.layer = nn.Linear(4, 5)

    def forward(self, x):
        x = self.layer(x)
        return x
    
class Net_nu(nn.Module):
    def __init__(self):
        super(Net_nu, self).__init__()
        self.layer = nn.Linear(5, 5)  # output matches u's shape so the MSE below compares matching sizes

    def forward(self, x):
        x = self.layer(x)
        return x

net_1 = Net_u()
net_2 = Net_nu()
optimizer_u = optim.Adam(net_1.parameters(), lr=0.001)
optimizer_nu = optim.Adam(net_2.parameters(), lr=0.001)

optimizer_u.zero_grad()
optimizer_nu.zero_grad()

y = torch.rand(2, 4)
loss_fn = nn.MSELoss()  # distinct name so the criterion is not shadowed by the loss value below

u = net_1(y)   # output of the first network
nu = net_2(u)  # fed directly into the second network

loss = loss_fn(u, nu)
loss.backward()     # a single backward pass fills .grad for the parameters of both networks
optimizer_u.step()  # each optimizer updates only its own network's parameters
optimizer_nu.step()
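
To confirm that the single loss.backward() call really produces gradients for both networks, you can inspect the .grad attributes afterwards (a quick sanity check, continuing directly from the snippet above):

# Every parameter of both networks should have a populated gradient after loss.backward()
for name, p in list(net_1.named_parameters()) + list(net_2.named_parameters()):
    print(name, p.grad is not None)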

My question was about the use of optimizers. Is defining two separate optimizers, one per network, fine?
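
For reference, since Adam updates each parameter independently, stepping two optimizers over disjoint parameter sets after one shared backward pass gives the same result as a single optimizer over the combined parameter list. A minimal sketch of that variant (assuming the same Net_u/Net_nu setup and loss_fn as above):

# Single-optimizer variant: one Adam instance holds the parameters of both networks
optimizer = optim.Adam(list(net_1.parameters()) + list(net_2.parameters()), lr=0.001)

optimizer.zero_grad()  # clears the gradients of both networks in one call
u = net_1(y)
nu = net_2(u)
loss = loss_fn(u, nu)
loss.backward()
optimizer.step()       # updates the parameters of both networks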