I'm working on a regression task, training a CNN on 334x334 satellite images. As I mentioned in my previous posts, I use MSE loss with the Adam optimizer, and the loss fails to converge. I now realize why: the network only learns the mean of the targets. For example, if I feed it 100 data points whose targets have a mean of 7, then all of its predictions are very close to 7 (as shown in the picture below, where the x-axis is the true value and the y-axis is the predicted value):
One of my guesses is that my CNN does not learn any visual structure from the images, so I tried different architectures such as ResNet and AlexNet, but the problem persists. How can I prevent my network from collapsing to the mean like this?
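
To put a number on the collapse described above, one option is to compare the model's MSE with the MSE of a constant predictor that always outputs the target mean (which equals the variance of the targets), and to compare the spread of the predictions with the spread of the targets. The following is only a minimal sketch: the function name `mean_collapse_report` and the synthetic data are assumptions made for illustration, not taken from the actual training code.

```python
import numpy as np

def mean_collapse_report(y_true, y_pred):
    """Compare a model's predictions against a predict-the-mean baseline.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # MSE of the model's predictions.
    model_mse = float(np.mean((y_true - y_pred) ** 2))
    # MSE of a constant predictor that always outputs the target mean;
    # this is exactly the (population) variance of the targets.
    baseline_mse = float(np.mean((y_true - y_true.mean()) ** 2))

    # Ratio of prediction spread to target spread; a value near 0 means
    # the predictions are almost constant.
    target_std = float(y_true.std())
    std_ratio = float(y_pred.std()) / target_std if target_std > 0 else float("nan")

    return {
        "model_mse": model_mse,
        "baseline_mse": baseline_mse,
        "std_ratio": std_ratio,
    }

# Synthetic stand-in for the situation in the question (assumed data):
# 100 targets centred on 7, predictions that barely move from the mean.
rng = np.random.default_rng(0)
y_true = rng.normal(loc=7.0, scale=2.0, size=100)
y_pred = np.full(100, y_true.mean()) + rng.normal(scale=0.05, size=100)

report = mean_collapse_report(y_true, y_pred)
print(report)
```

If `model_mse` is roughly equal to `baseline_mse` and `std_ratio` is close to 0, the predictions carry essentially no information beyond the target mean, which matches the flat scatter plot described above.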