Hello everybody, I’m writing here to ask some opinions about my situation. I’ve been using pytorch and pytorch geometric for my deep learning architecture on a multi-regression with two targets. Everything looks pretty ok, except that, during training process, my validation loss keeps being lower than training one.

I’m attaching here my train and validation `for`

loops:

```
def train_model(model, train_loader,val_loader,lr):
"Model training"
epochs=50
model.train()
train_losses = []
val_losses = []
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=1e-5)
#Reduce learning rate if no improvement is observed after 10 Epochs.
#scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min', patience=2, verbose=True)
for epoch in range(epochs):
loss_batches=[]
for data in train_loader:
y_pred = model.forward(data)
loss1 = criterion(y_pred[:, 0], data.y[0])
loss2 = criterion(y_pred[:,1], data.y[1])
train_loss = 0.8*loss1+0.2*loss2
optimizer.zero_grad()
train_loss.backward()
optimizer.step()
loss_batches.append(train_loss.item())
train_losses.append(sum(loss_batches)/len(loss_batches))
with torch.no_grad():
loss_batches=[]
for data in val_loader:
y_val = model.forward(data)
loss1 = criterion(y_val[:,0], data.y[0])
loss2 = criterion(y_val[:,1], data.y[1])
val_loss = 0.5*loss1+0.5*loss2
loss_batches.append(val_loss.item())
val_losses.append(sum(loss_batches)/len(loss_batches))
print(f'Epoch: {epoch}, train_loss: {train_losses[epoch]:.3f} , val_loss: {val_losses[epoch]:.3f}')
return train_losses, val_losses
```

I’m perplexed about the correctness of my methodology, especially about the indentation of my code, but I really can’t figure out what can be wrong. I’m also attaching the information printed out during training for each epoch: (as you can see from the code I’m displaying a final train/val loss that is just the average of all train/val losses of all batches)

```
Epoch: 0, train_loss: 7.378 , val_loss: 5.690
Epoch: 1, train_loss: 5.618 , val_loss: 3.611
Epoch: 2, train_loss: 2.789 , val_loss: 1.831
Epoch: 3, train_loss: 2.037 , val_loss: 1.623
Epoch: 4, train_loss: 1.850 , val_loss: 1.502
Epoch: 5, train_loss: 1.682 , val_loss: 1.400
Epoch: 6, train_loss: 1.536 , val_loss: 1.331
Epoch: 7, train_loss: 1.440 , val_loss: 1.295
Epoch: 8, train_loss: 1.387 , val_loss: 1.273
Epoch: 9, train_loss: 1.356 , val_loss: 1.256
Epoch: 10, train_loss: 1.335 , val_loss: 1.242
Epoch: 11, train_loss: 1.319 , val_loss: 1.230
Epoch: 12, train_loss: 1.306 , val_loss: 1.218
Epoch: 13, train_loss: 1.295 , val_loss: 1.206
Epoch: 14, train_loss: 1.284 , val_loss: 1.192
Epoch: 15, train_loss: 1.273 , val_loss: 1.175
Epoch: 16, train_loss: 1.261 , val_loss: 1.157
Epoch: 17, train_loss: 1.250 , val_loss: 1.139
Epoch: 18, train_loss: 1.239 , val_loss: 1.119
Epoch: 19, train_loss: 1.227 , val_loss: 1.098
Epoch: 20, train_loss: 1.216 , val_loss: 1.076
Epoch: 21, train_loss: 1.204 , val_loss: 1.053
Epoch: 22, train_loss: 1.192 , val_loss: 1.030
Epoch: 23, train_loss: 1.181 , val_loss: 1.009
Epoch: 24, train_loss: 1.171 , val_loss: 0.989
Epoch: 25, train_loss: 1.161 , val_loss: 0.972
Epoch: 26, train_loss: 1.153 , val_loss: 0.958
Epoch: 27, train_loss: 1.146 , val_loss: 0.946
Epoch: 28, train_loss: 1.140 , val_loss: 0.937
Epoch: 29, train_loss: 1.135 , val_loss: 0.930
Epoch: 30, train_loss: 1.131 , val_loss: 0.924
Epoch: 31, train_loss: 1.128 , val_loss: 0.919
Epoch: 32, train_loss: 1.124 , val_loss: 0.915
Epoch: 33, train_loss: 1.122 , val_loss: 0.911
Epoch: 34, train_loss: 1.119 , val_loss: 0.908
Epoch: 35, train_loss: 1.117 , val_loss: 0.905
Epoch: 36, train_loss: 1.114 , val_loss: 0.902
Epoch: 37, train_loss: 1.112 , val_loss: 0.899
Epoch: 38, train_loss: 1.110 , val_loss: 0.897
Epoch: 39, train_loss: 1.108 , val_loss: 0.894
Epoch: 40, train_loss: 1.106 , val_loss: 0.892
Epoch: 41, train_loss: 1.105 , val_loss: 0.890
Epoch: 42, train_loss: 1.103 , val_loss: 0.888
Epoch: 43, train_loss: 1.102 , val_loss: 0.886
Epoch: 44, train_loss: 1.100 , val_loss: 0.885
Epoch: 45, train_loss: 1.099 , val_loss: 0.883
Epoch: 46, train_loss: 1.097 , val_loss: 0.881
Epoch: 47, train_loss: 1.096 , val_loss: 0.880
Epoch: 48, train_loss: 1.095 , val_loss: 0.878
Epoch: 49, train_loss: 1.093 , val_loss: 0.876
```

The funny thing is, that apart from this strangeness, the model seems to work, since considering completely new unseen data in my `test_loader`

, the predictions appear to be pretty accurate, and I’m able to get to an `r2_score`

of 0.52 for the first target and 0.72 for the second one. I appreciate anybody providing his/her opinion about this situation

Many thanks,

Federico