I’m pretty new to ML in general, and I’m wondering if there’s something wrong (either conceptually or practically) with my code
I’m feeding the input through some CNN layers, a flatten layer, and a few hidden layers, and the final output is a 1D vector of some dimension (all values strictly positive).
I want the model to fit a particular index of the output vector more closely, so I implemented the following weighted MSE:
```python
class Weighted_MSE(nn.Module):
    def __init__(self, weights):
        # super().__init__() must run before assigning attributes,
        # otherwise nn.Module raises an AttributeError
        super(Weighted_MSE, self).__init__()
        self.weights = weights

    def forward(self, inputs, targets):
        MSEW = torch.mean(self.weights * (inputs - targets) ** 2)
        return MSEW
```
If I want the model to fit index 0 of the output, the weights look like:
tensor([4., 1., 1., 1., 1., ....])
But in practice, what I found is that when the weight is small, the model behaves like a regular MSE. Once the weight increases beyond a certain threshold, the model no longer fits that output index and instead always outputs 0 there.
This is exactly the opposite of what I expected: since all the training output values are strictly greater than 0, the larger weight seems to amplify the training error at that index rather than reduce it.
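For reference, here is a quick self-contained sanity check of the loss (restating the class so the snippet runs on its own, with made-up toy tensors). It confirms the weight simply scales the per-index gradient, which is what I assumed it would do:

```python
import torch
import torch.nn as nn

class Weighted_MSE(nn.Module):
    def __init__(self, weights):
        super(Weighted_MSE, self).__init__()
        self.weights = weights

    def forward(self, inputs, targets):
        # element-wise weighted squared error, averaged over all elements
        return torch.mean(self.weights * (inputs - targets) ** 2)

# toy example: weight 4 on index 0, weight 1 elsewhere
weights = torch.tensor([4., 1., 1., 1., 1.])
criterion = Weighted_MSE(weights)

inputs = torch.zeros(2, 5, requires_grad=True)   # fake predictions
targets = torch.ones(2, 5)                       # fake targets

loss = criterion(inputs, targets)
loss.backward()

# the gradient at index 0 is 4x the gradient at the other indices
print(inputs.grad[0])
```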
Any help is appreciated!