As an aside, you probably don’t want to use MSELoss for a
multi-class problem.
out.sum(0) is summing over the batch dimension of your input / target (even if your batch size is 1).
Given what you say, I speculate that input (the output of your
model) is a vector of 61 values, one for each class (for a single
sample in your batch), and that target is also a vector of 61
values (perhaps your class labels one-hot encoded).
If you want to do this (and you probably don’t), you should be
using out.sum() to sum over all elements of the out tensor,
that is, over both the batch and class dimensions.
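To make the difference concrete, here is a minimal sketch (the shape
[nBatch, nClass] = [4, 61] is my assumption based on your description):

```python
import torch

out = torch.randn(4, 61) ** 2   # e.g., per-element squared errors, shape [nBatch, nClass]
print(out.sum(0).shape)         # torch.Size([61]) -- one value per class, not a scalar loss
print(out.sum().shape)          # torch.Size([])   -- a single scalar, suitable for backward()
```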
What I really mean is that you should reorganize your problem a
little bit and use BCEWithLogitsLoss.
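Something along these lines (just a sketch; the batch size of 4 and the
random stand-ins for your model output and labels are my assumptions):

```python
import torch

logits = torch.randn(4, 61, requires_grad=True)   # stand-in for your model's raw output (no sigmoid)
labels = torch.randint(0, 61, (4,))               # stand-in integer class labels
target = torch.nn.functional.one_hot(labels, num_classes=61).float()
loss = torch.nn.BCEWithLogitsLoss()(logits, target)   # scalar (mean over all elements by default)
loss.backward()
```

Note that BCEWithLogitsLoss takes raw logits and applies the sigmoid
internally, so don’t put a sigmoid at the end of your model.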
However, if you have a good reason to be using MSELoss (and I’m
right that your input and target have shape [nBatch, nClass]),
then, yes, you should return out.sum() (rather than out.sum(0))
so that you will be summing over classes as well as the samples in
your batch.
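In that case your loss function would look something like this (a sketch
of what I’m guessing your code does; the function name is hypothetical):

```python
import torch

def mse_sum_loss(input, target):
    # input, target: shape [nBatch, nClass]
    out = (input - target) ** 2
    return out.sum()   # sum over both batch and class dimensions (not out.sum(0))
```

This is equivalent to torch.nn.MSELoss(reduction='sum')(input, target),
so you don’t actually need to write it yourself.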