Validation Loss Decreasing but Training Loss Fluctuating

Two things I’ve noticed:

  • While I don’t know your loss function, Softplus is generally not suitable for the output layer of a classifier: its outputs are unbounded and unnormalized, so they can’t be interpreted as class probabilities the way Softmax outputs can (see the first sketch after this list)

  • You have the line x = x.view(-1, 50 * 47). It might be correct, but putting -1 in the batch dimension is a common trap for subtle errors: if the per-sample size ever stops matching 50 * 47, the batch dimension can silently change instead of raising an error. You might want to double-check the tensor shapes at that point (see the second sketch below)
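
On the first point: since I don’t know your model or loss, here is only a minimal sketch of the usual PyTorch pattern for multi-class classification. The layer sizes and class count are placeholders. The final layer emits raw logits and `nn.CrossEntropyLoss` applies log-softmax internally, so neither Softplus nor an explicit Softmax belongs in the model itself:

```python
import torch
import torch.nn as nn

# Hypothetical classifier head -- layer sizes and 10 classes are placeholders.
model = nn.Sequential(
    nn.Linear(50 * 47, 128),
    nn.ReLU(),
    nn.Linear(128, 10),   # raw logits: no Softplus/Softmax here
)

criterion = nn.CrossEntropyLoss()  # applies log-softmax + NLL internally

x = torch.randn(8, 50 * 47)          # dummy batch of 8 flattened inputs
targets = torch.randint(0, 10, (8,)) # dummy class labels

logits = model(x)
loss = criterion(logits, targets)

# Convert to probabilities only when you actually need them (e.g. reporting):
probs = torch.softmax(logits, dim=1)
```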
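
On the second point: a safer pattern is to keep the batch dimension explicit and flatten the rest, so a size mismatch surfaces as a clear error instead of silently regrouping samples. A minimal sketch, assuming x comes out of an earlier layer with 50 * 47 elements per sample:

```python
import torch

x = torch.randn(8, 50, 47)  # hypothetical activations: (batch, 50, 47)

# Preferred: pin the batch dimension and flatten everything else.
flat = x.view(x.size(0), -1)          # shape (8, 50 * 47)
# or equivalently:
flat = torch.flatten(x, start_dim=1)  # shape (8, 50 * 47)

# x.view(-1, 50 * 47) gives the same result here, but if an upstream change
# alters the per-sample element count, it can silently change the batch size
# (mixing samples across rows) whenever the total count is still divisible
# by 50 * 47. With x.view(x.size(0), -1) the batch size stays fixed and the
# mismatch shows up as an obvious shape error in the next layer.
```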