I have an image segmentation setup (Learned Random Walker), but I would like to generalize it.
```python
# Init the random walker module
rw = RandomWalker(1000, max_backprop=True)

# Load data and init
target = init_image(imagem)
seeds = init_seeds(torch.zeros(1, 60, 59).long(), target)
diffusivities = torch.zeros(1, 2, 60, 59, requires_grad=True)

# Init optimizer
optimizer = torch.optim.Adam([diffusivities], lr=0.5)
loss = torch.nn.NLLLoss()

# Main overfit loop
for it in range(iterations + 1):
    optimizer.zero_grad()

    # Diffusivities must be positive
    net_output = torch.sigmoid(diffusivities)

    # Random walker
    output = rw(net_output, seeds)

    # Loss and diffusivities update
    m = torch.nn.LogSoftmax(dim=1)
    output_log = [m(o) for o in output]
    ls = loss(output_log, target)
    ls.backward()
    optimizer.step()
```
Given a target of dimension [1, 1, 60, 59], the output will have dimension [1, nClass, 60, 59], i.e. one probability map per class (region).
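As a minimal sketch of the shapes involved (random tensors stand in for the real random-walker output; note that NLLLoss wants the target as [1, 60, 59], without the extra channel dimension, so a [1, 1, 60, 59] target would need a `squeeze(1)` first):

```python
import torch

# Shapes from the question: output [1, nClass, H, W], target [1, H, W]
nClass = 3
output = torch.randn(1, nClass, 60, 59)          # placeholder for the RW output
target = torch.randint(0, nClass, (1, 60, 59))   # class indices in 0..nClass-1

log_probs = torch.nn.LogSoftmax(dim=1)(output)   # [1, nClass, 60, 59]
ls = torch.nn.NLLLoss()(log_probs, target)       # scalar loss
```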
The problem: with this same target shape, my image target holds gray values in the range [0, 255] encoding 3 regions. NLLLoss does not let me do this, because the output has dimension [1, 3, 60, 59] while many target pixels hold values greater than 2, i.e. outside the valid class-index range 0..nClass-1.
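One common workaround for this mismatch is to remap the gray values to consecutive class indices before computing the loss. A minimal sketch, assuming each region is encoded by one distinct gray value (the example values 0, 127, 255 are illustrative):

```python
import torch

# Hypothetical target where 3 regions are encoded as gray values 0, 127, 255
target = torch.tensor([[0, 127, 255, 127]])

# Map each distinct gray value to a class index 0..nClass-1;
# return_inverse gives, for every pixel, the index of its unique value
values, remapped = torch.unique(target, return_inverse=True)
# values   -> tensor([  0, 127, 255])
# remapped -> tensor([[0, 1, 2, 1]]), now a valid NLLLoss target
```

The `remapped` tensor keeps the original shape, so the rest of the training loop can stay unchanged.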
Is there any way to work around this, or a loss function that handles this case?
Any help is welcome, thanks.