# Classes, Accuracy, and Loss in Pixel-wise using Masks

KFrank · May 16, 2020, 1:07am

Hello Ryan!

Since you’re using a version of “UNet” and the first layer of your
model is:

Conv2d(1, 32, kernel_size=k_size, padding=pad)

it looks like your model is expecting inputs of shape
[nBatch, nChannel, height, width], where nChannel = 1.
This makes sense for a grayscale image. (If your model were set up
for color images, you would probably have nChannel = 3 for the
three RGB channels.)
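
As a rough sketch of what that first layer expects, here is a shape check. The values of k_size and pad, the batch size, and the image size are all assumptions for illustration; the original post does not give them.

```python
import torch
import torch.nn as nn

# Assumed values; the original post does not state k_size or pad.
k_size, pad = 3, 1
first_layer = nn.Conv2d(1, 32, kernel_size=k_size, padding=pad)

# Hypothetical batch of 4 grayscale images, 64x64, stored without
# a channel dimension: [nBatch, height, width].
images = torch.randn(4, 64, 64)

# Insert the singleton channel dimension: [nBatch, 1, height, width].
batch = images.unsqueeze(1)
features = first_layer(batch)

print(batch.shape)     # shape of the input to the layer
print(features.shape)  # 32 output channels from the first layer
```

With these assumed values the padding keeps height and width unchanged, so only the channel dimension changes, from 1 to 32.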

What was the shape of a batch of images to input to your model
before you did any reshaping, etc.? What was the shape of the
output of your model before reshaping? What is the shape of a
batch of masks that you pass to your loss function?

The point is that “UNet” typically carries along a “channel”
dimension, which, in your case, appears to be of size 1 for both
the input and output of your model.

BCEWithLogitsLoss requires that the shape of its input (the
output of your model) and its target (your mask) be the same.
They can both have this “singleton” nChannel = 1 dimension,
but, if so, they both have to have it.
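
To make the shape requirement concrete, here is a minimal sketch with made-up tensors standing in for your model output and masks (the sizes are assumptions, not taken from your code). Either add the channel dimension to the mask or remove it from the output, so that both shapes agree:

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Stand-in for the model output: [nBatch, 1, height, width].
logits = torch.randn(4, 1, 64, 64)

# Stand-in for the masks, stored without a channel dimension.
# The target must be floating point, with values 0.0 or 1.0.
masks = torch.randint(0, 2, (4, 64, 64)).float()

# Option 1: give the mask the singleton channel dimension.
loss_a = criterion(logits, masks.unsqueeze(1))

# Option 2: drop the singleton channel dimension from the output.
loss_b = criterion(logits.squeeze(1), masks)

print(loss_a.item(), loss_b.item())
```

Both options compare the same elements, so they should give the same loss value.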

Using reduction='none' looks wrong. It means don’t sum (or
average) the loss over the elements of output (and target),
so the loss comes back as a tensor with one value per element.
But, for backpropagation, you want a single scalar loss, so you
should use the default, reduction='mean'.
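
Here is a small sketch of the difference between the two reduction settings, again with made-up tensors (the shapes are assumptions for illustration):

```python
import torch
import torch.nn as nn

# Stand-ins for the model output and the mask, with matching shapes.
logits = torch.randn(4, 1, 8, 8, requires_grad=True)
masks = torch.randint(0, 2, (4, 1, 8, 8)).float()

# reduction='none' returns one loss value per element.
per_pixel = nn.BCEWithLogitsLoss(reduction='none')(logits, masks)

# The default, reduction='mean', returns a single scalar.
scalar = nn.BCEWithLogitsLoss()(logits, masks)

print(per_pixel.shape)  # same shape as logits
print(scalar.shape)     # zero-dimensional (scalar) tensor

# A scalar loss can be backpropagated directly.
scalar.backward()
```

The scalar is just the mean of the per-element values, so if you do want the unreduced losses (say, to weight pixels differently), you can reduce them yourself before calling backward().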

Best.

K. Frank