How to save a multi-class segmentation prediction as an image?

Hi,

My multi-class UNET model output has the shape [1, 6, 100, 100], which is expected: the batch size is 1, I have 6 classes, and the image size is 100x100.

How can I save a prediction that contains all 6 classes as an image using torchvision.utils.save_image? save_image expects a tensor with either 1 or 3 channels, but my model outputs 6 channels, one per class.

Thank you.

There might be a utility function somewhere that does this, but you can write your own: for each pixel, find the class with the highest score and paint it with that class's color, like:

    class_to_color = [torch.tensor([1.0, 0.0, 0.0]), ...]
    output = torch.zeros(1, 3, out.size(-2), out.size(-1), dtype=torch.float)
    for class_idx, color in enumerate(class_to_color):
        mask = out[:,class_idx,:,:] == torch.max(out, dim=1)
        mask = mask.unsqueeze(1) # should have shape 1, 1, 100, 100
        curr_color = color.reshape(1, 3, 1, 1)
        segment = mask*color # should have shape 1, 3, 100, 100
        output += segment

Thank you for your response. I get the following error when executing that code:

mask = mask.unsqueeze(1) # should have shape 1, 1, 100, 100
AttributeError: 'bool' object has no attribute 'unsqueeze'

since mask is a bool from mask = out[:,class_idx,:,:] == torch.max(out, dim=1).

What is this line trying to accomplish, and how should it be rewritten so that mask is a tensor instead of a bool? Thank you!

Ah, that’s because torch.max with a dim argument gives back both the values and the indices; changing the line to mask = out[:,class_idx,:,:] == torch.max(out, dim=1)[0] (taking just the values) should fix that issue.
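
You can see the two return values in an interactive session:

>>> out = torch.randn(1, 6, 100, 100)
>>> torch.max(out, dim=1).values.shape
torch.Size([1, 100, 100])
>>> torch.max(out, dim=1).indices.shape
torch.Size([1, 100, 100])

So the full loop becomes: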

    class_to_color = [torch.tensor([1.0, 0.0, 0.0]), ...]
    output = torch.zeros(1, 3, out.size(-2), out.size(-1), dtype=torch.float)
    for class_idx, color in enumerate(class_to_color):
        mask = out[:,class_idx,:,:] == torch.max(out, dim=1)[0]
        mask = mask.unsqueeze(1) # should have shape 1, 1, 100, 100
        curr_color = color.reshape(1, 3, 1, 1)
        segment = mask*color # should have shape 1, 3, 100, 100
        output += segment

Thank you. That solved that issue, but now I am running into a new one.

segment = mask*color # should have shape 1, 3, 100, 100
RuntimeError: The size of tensor a (100) must match the size of tensor b (3) at non-singleton dimension 3

Also, after mask = mask.unsqueeze(1), the shape of mask is [1, 3, 100, 100] instead of [1, 1, 100, 100], and I am not sure why.

I cannot reproduce that issue.
Can you check that the shape of your out tensor is what you expect?

>>> out = torch.randn(1, 6, 100, 100)
>>> mask = out[:,0,:,:] == torch.max(out, dim=1)[0]
>>> mask.shape
torch.Size([1, 100, 100])
>>> mask.unsqueeze(1).shape
torch.Size([1, 1, 100, 100])

Yes, the shape is correct.

Here is the code:

    pred = torch.sigmoid(model(x))
    out = (pred > 0.5).float()
    print(f"out shape: {out.shape}\n")
    class_to_color = [torch.tensor([0.0, 0.0, 0.0]), torch.tensor([10, 133, 1]),
                      torch.tensor([14, 1, 133]), torch.tensor([33, 255, 1]),
                      torch.tensor([243, 5, 247]), torch.tensor([255, 0, 0])]
    output = torch.zeros(1, 3, out.size(-2), out.size(-1), dtype=torch.float)
    for class_idx, color in enumerate(class_to_color):
        mask = out[:,class_idx,:,:] == torch.max(out, dim=1)[0]
        print(f"{mask}\n")
        mask = mask.unsqueeze(1) # should have shape 1, 1, 100, 100
        print(f"mask shape: {mask.shape}\n")
        curr_color = color.reshape(1, 3, 1, 1)
        print(f"color shape: {color.shape}\n")
        segment = mask*color # should have shape 1, 3, 100, 100
        output += segment
    torchvision.utils.save_image(output, f"{folder}/pred_{idx}.png")

Here is the corresponding output:

out shape: torch.Size([1, 6, 100, 100])

tensor([[[True, True, True,  ..., True, True, True],
         [True, True, True,  ..., True, True, True],
         [True, True, True,  ..., True, True, True],
         ...,
         [True, True, True,  ..., True, True, True],
         [True, True, True,  ..., True, True, True],
         [True, True, True,  ..., True, True, True]]], device='cuda:0')

mask shape: torch.Size([1, 1, 100, 100])

color shape: torch.Size([3])

And the error:

segment = mask*color # should have shape 1, 3, 100, 100
RuntimeError: The size of tensor a (100) must match the size of tensor b (3) at non-singleton dimension 3

Ah, curr_color should be used instead of color when computing segment:

    class_to_color = [torch.tensor([1.0, 0.0, 0.0]), ...]
    output = torch.zeros(1, 3, out.size(-2), out.size(-1), dtype=torch.float)
    for class_idx, color in enumerate(class_to_color):
        mask = out[:,class_idx,:,:] == torch.max(out, dim=1)[0]
        mask = mask.unsqueeze(1) # should have shape 1, 1, 100, 100
        curr_color = color.reshape(1, 3, 1, 1)
        segment = mask*curr_color # should have shape 1, 3, 100, 100
        output += segment

However, note that you should make sure the color format is consistent: your class_to_color list currently mixes floats between 0.0 and 1.0 with integers between 0 and 255, and save_image expects floats in [0, 1].
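
For example, dividing the 0-255 values by 255 keeps your palette but gives save_image the floats in [0, 1] it expects (the colors below are the ones from your snippet):

    class_to_color = [
        torch.tensor([0.0, 0.0, 0.0]),             # background stays black
        torch.tensor([10.0, 133.0, 1.0]) / 255.0,
        torch.tensor([14.0, 1.0, 133.0]) / 255.0,
        torch.tensor([33.0, 255.0, 1.0]) / 255.0,
        torch.tensor([243.0, 5.0, 247.0]) / 255.0,
        torch.tensor([255.0, 0.0, 0.0]) / 255.0,
    ]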

Thank you! That solved that issue, but now I am running into the following error 🧐

segment = mask*curr_color # should have shape 1, 3, 100, 100
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Did one of mask or curr_color get implicitly allocated on the CPU? I thought everything gets allocated on the GPU by default.

Nope, tensors are allocated on the CPU by default. Your mask lives on the GPU because it is derived from the model output, while the colors were created on the CPU. You can simply add device='cuda' to the torch.tensor(...) calls to fix this.
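
For example, here is a sketch that derives the device from out, so the colors always end up wherever the model output lives:

    device = out.device  # cuda:0 in your case, since out comes from the model
    class_to_color = [torch.tensor([1.0, 0.0, 0.0], device=device), ...]
    output = torch.zeros(1, 3, out.size(-2), out.size(-1),
                         dtype=torch.float, device=device)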


Works like a charm. I also had to move output to the GPU.
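
For anyone who finds this thread later, here is the complete version that works for me (model, x, folder, and idx come from my training loop above; colors normalized to [0, 1] as suggested):

    import torch
    import torchvision

    pred = torch.sigmoid(model(x))
    out = (pred > 0.5).float()  # 1, 6, 100, 100

    device = out.device
    class_to_color = [
        torch.tensor([0.0, 0.0, 0.0], device=device),
        torch.tensor([10.0, 133.0, 1.0], device=device) / 255.0,
        torch.tensor([14.0, 1.0, 133.0], device=device) / 255.0,
        torch.tensor([33.0, 255.0, 1.0], device=device) / 255.0,
        torch.tensor([243.0, 5.0, 247.0], device=device) / 255.0,
        torch.tensor([255.0, 0.0, 0.0], device=device) / 255.0,
    ]

    output = torch.zeros(1, 3, out.size(-2), out.size(-1),
                         dtype=torch.float, device=device)
    for class_idx, color in enumerate(class_to_color):
        # pixels where this class has the highest score
        mask = out[:, class_idx, :, :] == torch.max(out, dim=1)[0]
        mask = mask.unsqueeze(1)                 # 1, 1, 100, 100
        curr_color = color.reshape(1, 3, 1, 1)
        segment = mask * curr_color              # 1, 3, 100, 100
        output += segment

    torchvision.utils.save_image(output, f"{folder}/pred_{idx}.png")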

Thank you very much, I really appreciate your prompt and helpful responses! 🙂
