Hello, I am trying to train a convolutional neural network:
class ConvAutoencoder(nn.Module):
    def __init__(self):
        super(ConvAutoencoder, self).__init__()
        # Encoder
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        # batch_size, channels, height, width
        # 32/64, 2, 10,8
        self.conv2 = nn.Conv2d(16, 4, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        # Decoder
        self.t_conv1 = nn.ConvTranspose2d(4, 16, 2, stride=2)
        self.t_conv2 = nn.ConvTranspose2d(16, 1, 2, stride=2)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = F.relu(self.t_conv1(x))
        x = F.sigmoid(self.t_conv2(x))
        return x
With the following training loop:
n_epochs = 10000
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002)
criterion = nn.MSELoss()

train_matrix_array = get_matrix(train_dataset, img_height=8, batch_size=2)
test_matrix_array = get_matrix(test_dataset, img_height=8, batch_size=2)

for epoch in range(1, n_epochs + 1):
    # monitor training loss
    train_loss = 0.0

    # Training
    for data in train_matrix_array:
        images = data
        images = images.to(device)
        optimizer.zero_grad()
        outputs = model(images)  # 32x3 tensors
        # print("outputs: ", outputs , " images: ", images)
        loss = criterion(outputs, images)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_matrix_array)
The problem is that the input contains many tensors with negative values, but the output only ever contains positive values. These keep getting smaller and smaller. I understand that this is because they are trying to reach negative values, but they never become negative. I think this also affects the entries that should be positive, since their predictions are very small as well.
After a bit of training the values of the real input and the predictions look like this:
Prediction:
tensor([1.4951e-20, 1.7295e-21, 1.8791e-31, 1.5755e-20, 2.2758e-29, 3.1556e-31, 0.0000e+00, 1.6123e-28], grad_fn=<SelectBackward0>)

Real values:
tensor([-0.3655, 0.1866, -0.0160, -0.0876, 0.5151, -1.2745, -0.0676, -2.1106])
I assume it has to do with the last layer of the model, but I don't know which final activation would be more appropriate.
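To make the suspicion about the last layer concrete, here is a minimal sketch in plain Python (no PyTorch), assuming the final `F.sigmoid` is what bounds the output. The helpers `sigmoid` and `squared_error` are hypothetical names written for this illustration, and the target value is one of the real values shown above. It only demonstrates the range argument and is not a diagnosis of the full model.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function; the result is never negative."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def squared_error(pred: float, target: float) -> float:
    """Per-element squared error, the quantity that MSE averages."""
    return (pred - target) ** 2


target = -2.1106  # one of the negative real values from the post

# As the pre-activation z decreases, the sigmoid output approaches 0.
# The error against a negative target keeps shrinking but never reaches 0,
# so the optimizer has an incentive to push z further and further down.
for z in (0.0, -5.0, -20.0, -45.0):
    out = sigmoid(z)
    print(f"z={z:6.1f}  sigmoid={out:.4e}  error={squared_error(out, target):.6f}")

# Without a bounding activation, the raw output could match the target exactly.
print("error with unbounded output:", squared_error(target, target))
```

If this is indeed the cause, one thing to try would be returning `self.t_conv2(x)` directly in `forward`, without the sigmoid, so the output range is unbounded like the input. Another option would be to rescale the inputs into [0, 1] and keep the sigmoid.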