Training, learning image target

mortezamg63 · November 27, 2017, 6:39am

Hello
There is a big question in my mind. I do not know what I must do in this situation.
In artificial neural networks, we compute a loss value from the output of the last layer and use it for optimization, because the goal is to reduce that loss. Now, consider these problems:
1- I want to train the network for segmenting different objects in input image
2- I want to train the network for segmenting skin colors
3- I want to train the network for finding a hand in input image
There are many other tasks a network could be trained for.

The loss functions for these three problems can be regarded as different, and each returns a different value depending on its target. It seems to me that we could take a single deep network and train it on each of the three problems with different data and a loss function suited to the problem. The only difference would then be the loss values that backpropagation uses to update the weights. Am I thinking about this correctly?

The examples on websites mostly cover classification, where the target is a class number and the output of the network is not an image. How should the loss function be defined when the output is an image?
It seems that I must use a labeled image (ground truth) as the target and the output of the last layer (an image, called the prediction) as the output of the trained network. But I do not know whether computing the loss from the prediction and the ground-truth image is enough. Can the returned value make the network converge to segmenting hands, skin color, or objects, or must I consider other factors?

Venkatesh_Sakthivel · September 15, 2018, 7:10pm

+1, I have a similar problem.

ptrblck · September 15, 2018, 8:10pm

Segmentation tasks can be seen as pixel-wise classification, if each pixel can belong to only one class.
As in vanilla classification, we predict a class for the input, but for each pixel instead of for the whole image.
Loss functions like nn.CrossEntropyLoss work with multi-dimensional output.
Here is a small example using a dummy model with just one conv layer:

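import torch
import torch.nn as nn

n_classes = 2
c, h, w = 3, 24, 24

# Dummy model: one conv layer mapping c input channels to n_classes logits per pixel
model = nn.Conv2d(c, n_classes, 3, 1, 1)
criterion = nn.CrossEntropyLoss()

x = torch.randn(1, c, h, w)  # input batch: [N, C, H, W]
# Target holds one class index per pixel: [N, H, W], dtype long
target = torch.empty(1, h, w, dtype=torch.long).random_(n_classes)
output = model(x)  # logits: [N, n_classes, H, W]
loss = criterion(output, target)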
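
To connect this back to the original question about how the loss drives the weight updates, here is a minimal sketch that extends the dummy example with a few optimization steps and then turns the logits into a predicted label image. The optimizer choice (SGD), the learning rate, the number of steps, and the fixed seed are assumptions made only for illustration, and the random data stands in for real images and ground-truth masks.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed so the sketch is repeatable

n_classes = 2
c, h, w = 3, 24, 24

model = nn.Conv2d(c, n_classes, 3, 1, 1)
criterion = nn.CrossEntropyLoss()
# Assumption: plain SGD with lr=0.1, chosen only for this illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random stand-ins for one image and its ground-truth label image
x = torch.randn(1, c, h, w)
target = torch.empty(1, h, w, dtype=torch.long).random_(n_classes)

losses = []
for step in range(20):
    optimizer.zero_grad()             # clear gradients from the previous step
    output = model(x)                 # logits: [1, n_classes, h, w]
    loss = criterion(output, target)  # scalar, averaged over all pixels
    loss.backward()                   # backpropagate through the model
    optimizer.step()                  # update the weights
    losses.append(loss.item())

# The predicted segmentation is the highest-scoring class at each pixel
with torch.no_grad():
    pred = model(x).argmax(dim=1)     # label image: [1, h, w]
```

The training loop has the same shape for all three problems in the question. What changes is the data and what the class indices in the target mean (object classes, skin vs. background, hand vs. background).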