I’d like to train a U-Net that accepts a 3-channel image and returns a 3-channel image.
In a nutshell, the loss should accept an input of shape B x C x H x W and a target of shape B x C x H x W.
The network is meant for standard image processing, which is what justifies this approach.
Right now I’m getting this error:
invalid argument 1: only batches of spatial targets supported (3Dtensors) but got targets of dimension: 4 at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THCUNN/generic/SpatialClassNLLCriterion.cu:14
I hope I explained my problem clearly, and I’m looking forward to some feedback.
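The error can be reproduced with a small sketch (shapes here are made up, and recent PyTorch versions word the message slightly differently, but the cause is the same: nn.NLLLoss expects a 3D target of class indices, not a 4D tensor):

```python
import torch
import torch.nn as nn

# Hypothetical shapes: batch of 2, 3 channels, 8x8 images
output = torch.randn(2, 3, 8, 8).log_softmax(dim=1)  # B x C x H x W log-probs
target = torch.randint(0, 3, (2, 3, 8, 8))           # 4D target -> invalid for NLLLoss

try:
    nn.NLLLoss()(output, target)
except RuntimeError as e:
    print("NLLLoss rejected the 4D target:", e)
```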
What kind of output do you expect?
Since you are using NLLLoss, it seems you would like to perform a classification.
In this case, your target has to have the shape
[B, H, W] and contain the corresponding class indices.
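A minimal sketch of that setup (sizes are assumptions): nn.CrossEntropyLoss, which combines log_softmax and NLLLoss, takes a [B, C, H, W] output and a [B, H, W] LongTensor of class indices:

```python
import torch
import torch.nn as nn

B, C, H, W = 2, 3, 8, 8                  # assumed toy sizes
output = torch.randn(B, C, H, W)         # raw logits from the network
target = torch.randint(0, C, (B, H, W))  # class indices, no channel dimension
loss = nn.CrossEntropyLoss()(output, target)
print(loss)  # scalar loss tensor
```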
Thanks for your answer!
I tried NLLLoss2d() and CrossEntropyLoss(). Which loss should I use to get a [B, C, H, W] output?
It depends on your use case, so what kind of output are you expecting?
Is it a segmentation task with multi-labels? If so, you could use nn.BCELoss.
Alternatively you could try nn.BCEWithLogitsLoss.
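For a multi-label setup, both BCE-style losses accept a target with the same [B, C, H, W] shape as the output; a minimal sketch with assumed shapes, using nn.BCEWithLogitsLoss on raw logits:

```python
import torch
import torch.nn as nn

B, C, H, W = 2, 3, 8, 8            # assumed toy sizes
logits = torch.randn(B, C, H, W)   # raw network output, no sigmoid applied
target = torch.rand(B, C, H, W)    # per-pixel, per-channel targets in [0, 1]
loss = nn.BCEWithLogitsLoss()(logits, target)
```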
I tried nn.BCELoss, but BCEWithLogitsLoss() works much better.
As a side note, you should use either
- sigmoid + nn.BCELoss, or
- no non-linearity (just logits) + nn.BCEWithLogitsLoss.
If you still have the
log_softmax from using
NLLLoss, you should replace or remove it.
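To illustrate the two valid pairings (toy tensors, assumed shapes): sigmoid + nn.BCELoss and raw logits + nn.BCEWithLogitsLoss compute the same value, the latter just in a more numerically stable way:

```python
import torch
import torch.nn as nn

logits = torch.randn(2, 3, 8, 8)  # raw network output
target = torch.rand(2, 3, 8, 8)   # targets in [0, 1]

loss_a = nn.BCELoss()(torch.sigmoid(logits), target)  # sigmoid + BCELoss
loss_b = nn.BCEWithLogitsLoss()(logits, target)       # logits + BCEWithLogitsLoss
# loss_a and loss_b agree up to floating-point error
```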