How to implement IoU for multi-class image segmentation?
I have found that this one works well: https://www.kaggle.com/iezepov/fast-iou-scoring-metric-in-pytorch-and-numpy
I wrote an extensive IoU notebook for a talk on the JIT, linked from https://lernapparat.de/pytorch-jit-android/ . The usual way is to do “class agnostic” IoU and a standard classification loss (e.g. cross entropy), so the multi-class part happens only in the latter.
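As a rough illustration of the class-agnostic part, here is a minimal sketch of IoU between paired boxes. It assumes boxes in corner format (x0, y0, x1, y1) with positive area, and box_iou is a hypothetical helper name, not code taken from the notebook.

```python
import torch

def box_iou(boxes_a, boxes_b):
    # Hypothetical helper. boxes_a, boxes_b: [n, 4] tensors in
    # (x0, y0, x1, y1) corner format (an assumption), compared row by row.
    x0 = torch.max(boxes_a[:, 0], boxes_b[:, 0])
    y0 = torch.max(boxes_a[:, 1], boxes_b[:, 1])
    x1 = torch.min(boxes_a[:, 2], boxes_b[:, 2])
    y1 = torch.min(boxes_a[:, 3], boxes_b[:, 3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    inter = (x1 - x0).clamp(min=0) * (y1 - y0).clamp(min=0)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    return inter / (area_a + area_b - inter)
```

The class does not appear anywhere here; it would be handled separately by the classification loss.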
I always recommend the SSD lecture from https://fast.ai/ 's second course.
Best regards
Thomas
Hi @tom, I want to calculate IoU where my labels are of dimension [batch, class, h, w] and I have 4 classes. Initially I had 4 masks per image and I stacked them together to form the above-mentioned dimension. Now I’m having difficulty calculating IoU per class. Can you help me with that?
I’m not sure I understand your labelling.
- I think you want 4 coordinates (x0, y0, h, w).
- Are you saying that you have class as an array dimension or do you have class as a separate tensor for each area?
- But then computing the IoU per item should be straightforward.
- If you don’t have class as an array dimension, but have that as an extra label, using index_add_ with the class as index should help you consolidate per-item IoU to something that has class as an array dimension.
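A minimal sketch of that last idea, assuming you already have one IoU value per item and a separate integer class label per item. The names mean_iou_per_class, item_iou and item_class are made up for illustration.

```python
import torch

def mean_iou_per_class(item_iou, item_class, num_classes):
    # item_iou:   [n] float tensor, one IoU per item (assumed precomputed)
    # item_class: [n] long tensor, class index of each item
    iou_sum = torch.zeros(num_classes, dtype=item_iou.dtype, device=item_iou.device)
    count = torch.zeros(num_classes, dtype=item_iou.dtype, device=item_iou.device)
    # Scatter-add each item's IoU (and a count of 1) into its class slot.
    iou_sum.index_add_(0, item_class, item_iou)
    count.index_add_(0, item_class, torch.ones_like(item_iou))
    # Classes without any item give 0 / 0 = NaN.
    return iou_sum / count
```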
Best regards
Thomas
Hi Thomas, I’ll be more clear.
- So for 1 input image, I have 4 masks that correspond to 4 classes.
- I converted each mask to binary (0/1)
- I stacked these masks to have this dimension: [batch, 4, h, w]
- Now I want to calculate IoU for each class.
- But w, h don’t carry location information?
- What’s the shape of your predictions? Both for area and class?
Best regards
Thomas
My logits are of the shape [batch, 4, h, w] as well, and then I’m doing preds = torch.sigmoid(logits) > 0.5 to get the predictions.
The shape of the predictions is also [batch, 4, h, w].
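For reference, a minimal sketch of per-class IoU for tensors of this shape, treating each of the 4 channels as an independent binary mask. The function name per_class_iou, the eps-free NaN handling and the choice to pool pixels over the whole batch are assumptions, not an established recipe.

```python
import torch

def per_class_iou(logits, targets, threshold=0.5):
    # logits, targets: [batch, num_classes, h, w]; targets hold 0/1 masks.
    preds = torch.sigmoid(logits) > threshold
    targets = targets.bool()
    # Reduce over batch and spatial dims, keeping the class dim.
    intersection = (preds & targets).sum(dim=(0, 2, 3)).float()
    union = (preds | targets).sum(dim=(0, 2, 3)).float()
    # A class that is absent from both prediction and target has no
    # defined IoU; mark it NaN so it can be skipped when averaging.
    nan = torch.full_like(union, float("nan"))
    return torch.where(union > 0, intersection / union.clamp(min=1), nan)
```

The result has one entry per class. Averaging the non-NaN entries gives a mean IoU over the classes that actually occur.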
I’ll have to pass with these.
Best regards and all the best for your project
Thomas
This only works for binary segmentation.
I think IoU, per-pixel mean IoU and per-image mean IoU are often misinterpreted.
For training, the code above is OK, but for reporting per-pixel mean IoU, you can have a look at: PyTorch-ENet/confusionmatrix.py at a67d048ec837849eb79dfb8ec51b629a9738b362 · davidtvs/PyTorch-ENet · GitHub
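The confusion-matrix approach can be sketched roughly as follows. This is not the code from the linked repository; it assumes integer label maps with exactly one class per pixel (e.g. an argmax over the class dimension), and the function names are hypothetical.

```python
import torch

def confusion_matrix(pred, target, num_classes):
    # pred, target: long tensors of the same shape with values in
    # [0, num_classes). Rows index the true class, columns the prediction.
    idx = target.flatten() * num_classes + pred.flatten()
    counts = torch.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)

def iou_from_confusion(conf):
    tp = conf.diag().float()
    fp = conf.sum(dim=0).float() - tp  # predicted as c, but another class
    fn = conf.sum(dim=1).float() - tp  # truly c, but predicted otherwise
    # Classes that never occur give 0 / 0 = NaN.
    return tp / (tp + fp + fn)
```

Summing the confusion matrices of all batches before calling iou_from_confusion gives the per-pixel IoU over the whole dataset, which is generally not the same as averaging per-image IoU values.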
Thanks for the reply. I will go through the shared link