I am new to PyTorch and am having trouble training a feed-forward neural network (ffnn) that I want to use for inverse predictions.
I want to create two ffnn where the outputs of the 1st ffnn are used as inputs for the 2nd ffnn. The input of the 1st ffnn is then compared with the output of the 2nd ffnn, and the two should hopefully be identical after training (this procedure is recommended in the literature).
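To make the data flow concrete, here is a minimal sketch of the chained setup. It uses small stand-in networks rather than the real ones, and it assumes that the 1st ffnn maps 3 values to 6 and the 2nd ffnn maps those 6 values back to 3. The names `inverse_net`, `regular_net`, and `reconstruction_loss` are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins (assumed sizes): the 1st ffnn maps 3 -> 6,
# the 2nd ffnn maps 6 -> 3, so its output can be compared with the input.
inverse_net = nn.Sequential(nn.Linear(3, 50), nn.ReLU(), nn.Linear(50, 6))
regular_net = nn.Sequential(nn.Linear(6, 50), nn.ReLU(), nn.Linear(50, 3))

x = torch.randn(8, 3)                        # batch of 8 samples, 3 features each
intermediate = inverse_net(x)                # shape (8, 6): output of the 1st ffnn
reconstruction = regular_net(intermediate)   # shape (8, 3): output of the 2nd ffnn

# The training signal is the mismatch between the reconstruction and the input.
reconstruction_loss = nn.functional.mse_loss(reconstruction, x)
```

If training works, `reconstruction_loss` should shrink while only the 1st ffnn is updated.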
What I have done so far:
I trained the 2nd ffnn separately and the results seem reasonable. I imported the 2nd ffnn and locked its weights and biases.
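import pandas as pd
import torch

# Load the pre-trained 2nd ffnn and switch it to inference mode.
net_regular = torch.jit.load('regular_net.pt')
net_regular.eval()

# Freeze all weights and biases of the 2nd ffnn.
for param in net_regular.parameters():
    param.requires_grad = False

normalization_values = pd.read_csv("regular_net_normalization_values.csv", sep=",")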
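One way to confirm that the freezing took effect is to count the parameters that still require gradients. The following is a sketch under assumptions: `frozen_net` is a hypothetical stand-in for the loaded network (assumed to take 6 inputs and return 3 outputs), and `count_trainable` is a helper introduced here, not part of PyTorch.

```python
import torch.nn as nn


def count_trainable(module: nn.Module) -> int:
    """Return the number of scalar parameters that still require gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Hypothetical stand-in for the loaded 2nd ffnn (assumed sizes 6 -> 3).
frozen_net = nn.Sequential(nn.Linear(6, 50), nn.ReLU(), nn.Linear(50, 3))
frozen_net.eval()
for param in frozen_net.parameters():
    param.requires_grad = False

trainable_after_freeze = count_trainable(frozen_net)
print(trainable_after_freeze)  # prints 0
```

A result of 0 means no parameter of the frozen network can accumulate gradients, so an optimizer could not update it even if it were handed those parameters.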
Then I created the 1st ffnn (the net shown here has been trimmed to shorten the code a bit):
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 inputs, four hidden layers of width 50, 6 outputs
        self.fc1 = nn.Linear(3, 50)
        self.fc2 = nn.Linear(50, 50)
        self.fc3 = nn.Linear(50, 50)
        self.fc4 = nn.Linear(50, 50)
        self.fc5 = nn.Linear(50, 6)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.relu(self.fc4(x))
        x = self.fc5(x)  # no activation on the output layer
        return x
I then connected the two by feeding the output of the 1st ffnn into the 2nd ffnn. The combined ffnn are then trained (I hope that only the weights of the 1st ffnn change, since the weights of the 2nd ffnn are locked). I calculate the loss between the input of the 1st ffnn and the output of the 2nd ffnn. The loss seems unreasonably small right at the start of training (“Loss tensor(0.0011, grad_fn=)”).
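import torch.nn as nn
import torch.optim as optim

net_inverse = Net().to(DEVICE)
# Only the parameters of the 1st ffnn are handed to the optimizer.
optimizer = optim.Adam(net_inverse.parameters(), lr=0.001)
criterion = nn.MSELoss()

for epoch in range(5):
    for X, y in set_train:
        X = X.to(DEVICE)
        optimizer.zero_grad()
        output1 = net_inverse(X)        # 1st ffnn: inverse prediction
        output2 = net_regular(output1)  # 2nd ffnn (frozen): reconstructs X
        batch_loss = criterion(output2, X)
        batch_loss.backward()
        optimizer.step()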
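Whether only the 1st ffnn is trained can be checked by comparing the weights of both networks before and after one optimizer step. This is a sketch with small stand-in networks and random data, not the real setup. The names `inverse_net`, `regular_net`, and the helper `changed` are hypothetical, and the layer sizes (3 -> 6 -> 3) are assumed.

```python
import copy

import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-ins for the two networks (assumed sizes 3 -> 6 -> 3).
inverse_net = nn.Sequential(nn.Linear(3, 50), nn.ReLU(), nn.Linear(50, 6))
regular_net = nn.Sequential(nn.Linear(6, 50), nn.ReLU(), nn.Linear(50, 3))
regular_net.eval()
for param in regular_net.parameters():
    param.requires_grad = False

# Snapshot both sets of weights before the training step.
inverse_before = copy.deepcopy(inverse_net.state_dict())
regular_before = copy.deepcopy(regular_net.state_dict())

optimizer = optim.Adam(inverse_net.parameters(), lr=0.001)
criterion = nn.MSELoss()

x = torch.randn(16, 3)  # random batch standing in for real training data
optimizer.zero_grad()
loss_value = criterion(regular_net(inverse_net(x)), x)
loss_value.backward()
optimizer.step()


def changed(before, module):
    """True if any tensor in the module differs from its snapshot."""
    current = module.state_dict()
    return any(not torch.equal(before[name], current[name]) for name in before)


inverse_changed = changed(inverse_before, inverse_net)
regular_changed = changed(regular_before, regular_net)
# Gradients must flow through the frozen net back into the 1st ffnn.
grads_reach_inverse = all(p.grad is not None for p in inverse_net.parameters())
```

If the wiring is right, `inverse_changed` and `grads_reach_inverse` should be true and `regular_changed` false. The same comparison could be run on the real networks after a single batch.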
I then save the 1st ffnn, as this is the net of interest to me, but its results are unreasonable when I test it. I am afraid that no learning has taken place.
Are the ffnn connected appropriately?
How can I make sure that only the 1st ffnn is trained?
Thank you in advance!