# Image segmentation with cross-entropy loss

I am a new user of PyTorch.
I’d like to use the cross-entropy loss function.

number of classes=2
output.shape=[4, 2, 224, 224]
output_min=tensor(-1.9295)
output_max=tensor(2.6400)

number of channels=3
target.shape=[4, 3, 224, 224]
targets_max=tensor(-2.1008)
targets_min=tensor(-2.1179)

How should I evaluate:
`loss = criterion(output, target)`?
Thanks.

Hello Neo!

As an aside, for a two-class classification problem, you will be
better off treating this explicitly as a binary problem, rather than
as a two-class instance of the more general multi-class problem.
To do so you would use `BCEWithLogitsLoss` (“Binary Cross
Entropy”), rather than the multi-class `CrossEntropyLoss`.
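
As a minimal sketch of that binary formulation (the shapes and random tensors here are illustrative stand-ins, not your actual model): the model emits a single logit channel per pixel, and the target is a float `0.0` / `1.0` mask of the same shape:

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# one logit channel per pixel (raw scores, not probabilities)
output = torch.randn(4, 1, 224, 224)
# float mask of 0.0 / 1.0 with the SAME shape as the output
target = torch.randint(0, 2, (4, 1, 224, 224)).float()

loss = criterion(output, target)  # scalar loss
```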

But you can certainly treat this as a general multi-class problem.

So your outputs are raw-score logits, rather than probabilities
that lie between `0.0` and `1.0`. Good.

I assume that your first dimension, `4`, is your batch size. This is
fine.

It probably does not make sense to have a channels dimension in
your target. (If you think it does, you should further explain your
use case.) In any event, as it stands, this target shape won’t match
what `CrossEntropyLoss` expects.

If your output shape is `[nBatch, nClass, height, width]`,
then (for `CrossEntropyLoss`) your target shape must be
`[nBatch, height, width]`, with no `nClass` dimension.

This is wrong (for `CrossEntropyLoss`). Your target values must
be integer (`long`) class labels that run from `0` to `nClass - 1`,
so in your two-class case, they take on the values `0` and `1`.
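
Putting the two requirements together, here is a minimal sketch using your shapes (random tensors standing in for your model output and your mask):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# output: [nBatch, nClass, height, width], raw-score logits
output = torch.randn(4, 2, 224, 224)
# target: [nBatch, height, width] -- no nClass dimension --
# with integer (long) class labels in {0, 1}
target = torch.randint(0, 2, (4, 224, 224))

loss = criterion(output, target)  # scalar loss
```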

If you could explain a little more where `target` comes from, and
what the numbers actually mean, we can help sort this out.

Best.

K. Frank

I’m sorry for the delay; I had some problems.
And I’m sorry for my English.

my model is:

number of classes=2 or 3 or 10

And the output dimension of the model is [No x Co x Ho x Wo]
where,

```
No -> the batch size (same as Ni)
Co -> the number of classes that the dataset has
Ho -> the height of the image (which is the same as Hi in almost all cases)
Wo -> the width of the image (which is the same as Wi in almost all cases)
```

number of channels=1 or 3

the target dimension is [Ni x Ci x Hi x Wi]
where,

```
Ni -> the batch size
Ci -> the number of channels (which is 3 or 1)
Hi -> the height of the image
Wi -> the width of the image
```

I apply the training transforms to both the image and the mask:

```python
train_transform = et.ExtCompose([
    # et.ExtResize(size=opts.crop_size),
    et.ExtRandomScale((0.5, 2.0)),
    et.ExtRandomHorizontalFlip(),
    et.ExtToTensor(),
    et.ExtNormalize(mean=[0.485, 0.456, 0.406],
                    std=[0.229, 0.224, 0.225]),
])
```

The mask is RGB or grayscale, with value 0 for the background, 1 for class 1, 2 for class 2, and so on.

Hello Neo!

`Ho` and `Wo` must be the same as `Hi` and `Wi` in all cases, not
just in “almost all” cases.

This might be what you have, but it simply won’t work.
`CrossEntropyLoss` requires that, for a model output of shape
`[No, Co, Ho, Wo]`, the target have shape `[No, Ho, Wo]`
(and that the values of the target are integer class labels that
run from `0` to `Co - 1`).

If your target has this extra “channel” dimension (`Ci`), it won’t
work (and `Hi` and `Wi` must match `Ho` and `Wo`, as well). (Just
to be clear, you can’t have the `Ci` dimension at all, even if
`Ci = 1`.)
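
As a minimal sketch of the fix, assuming the mask stores the class index in a single channel (the shapes and random tensors are illustrative): drop the `Ci` dimension and make sure the dtype is `long`:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# model output: [No, Co, Ho, Wo] logits for 3 classes
output = torch.randn(4, 3, 224, 224)

# target as loaded: [Ni, 1, Hi, Wi] with integer class values
target = torch.randint(0, 3, (4, 1, 224, 224))

# drop the channel dimension and cast to long: -> [4, 224, 224]
target = target.squeeze(1).long()

loss = criterion(output, target)  # scalar loss
```

If the mask is instead RGB with the class index replicated across its three channels, selecting one channel with `target[:, 0]` in place of `squeeze(1)` would serve the same purpose.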

Good luck.

K. Frank

Hey, sorry to disturb you; I just wanted to confirm:
1. The raw logits are supposed to be one-hot encoded along the class dimension, say with a sample shape of `(1, 6, 256, 256)` for multi-class classification with `6` labels.
I am confused why PyTorch doesn’t do this implicitly, though. Was my understanding correct?