I am trying to implement Fully Convolutional Networks (FCN) for semantic segmentation on the Pascal VOC 2012 dataset. I am having trouble loading the dataset. My questions are as follows:
- In FCN-32 the output has dimensions H x W x num_classes. Does this mean I have to convert my ground-truth segmentation maps to H x W x num_classes? If yes, how can I generate one-hot encoded ground-truth images?
- Apart from IoU, what loss function is used for this kind of task? Can categorical cross-entropy be used here?
1. No. Load the ground truth as an H x W map of class indices and let your network output H x W x num_classes.
2. Cross-entropy loss is fine (this is a dense, pixel-level classification problem).
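To make the shape question concrete, here is a minimal NumPy sketch of pixel-wise cross-entropy that takes H x W x num_classes scores and an H x W map of integer labels. The thread does not name a framework, so this is not any library's loss function; `pixelwise_cross_entropy` and `ignore_index` are hypothetical names, and skipping label 255 assumes the VOC convention of marking void/border pixels with 255.

```python
import numpy as np

def pixelwise_cross_entropy(logits, labels, ignore_index=255):
    """Mean cross-entropy over all non-ignored pixels.

    logits: float array of shape (H, W, num_classes), raw network scores.
    labels: int array of shape (H, W), one class index per pixel.
    ignore_index: label value to skip (assumed 255, the VOC void label).
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    valid = labels != ignore_index
    # Map ignored labels to 0 so indexing stays in range; they are masked out below.
    safe_labels = np.where(valid, labels, 0)
    # Pick the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_probs, safe_labels[..., None], axis=-1)[..., 0]
    return -(picked * valid).sum() / max(valid.sum(), 1)
```

The point of the sketch is that the target never needs a class dimension: the integer label only selects which of the num_classes log-probabilities counts at each pixel.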
But if I keep the network output as H x W x num_classes and the ground truth as H x W, I get an error saying that y_pred should have the same shape as y_target.
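If the loss in use does insist on matching shapes, one option is to expand the H x W label map to a one-hot H x W x num_classes array. The following is a sketch only: `one_hot_mask` is a hypothetical helper, and giving void pixels (assumed to be labelled 255, as in VOC) an all-zero vector is an assumption that may or may not suit a particular loss implementation.

```python
import numpy as np

def one_hot_mask(labels, num_classes=21, ignore_index=255):
    """Convert an (H, W) integer label map to an (H, W, num_classes) one-hot array."""
    valid = labels != ignore_index
    # Map ignored labels to class 0 temporarily so the lookup stays in range.
    safe = np.where(valid, labels, 0)
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype=np.float32)[safe]
    # Assumption: void pixels get an all-zero vector.
    one_hot[~valid] = 0.0
    return one_hot
```

Applying `argmax(-1)` to the result recovers the original label map for every non-void pixel.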
@lxtGH Thanks for the advice, it worked. I can now run the entire network, but I am getting a very high loss (~3.0 down to 1.8). Also, since the output has shape H x W x num_classes, how should I plot it to visualize my predictions?
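For context on those numbers: assuming the loss is a mean cross-entropy in natural-log units over 21 classes (20 VOC object classes plus background), a model that spreads probability uniformly over the classes scores ln 21, so a starting value near 3.0 is roughly what an untrained network gives. A quick check:

```python
import math

num_classes = 21  # assumed: 20 VOC object classes + background

# Cross-entropy of a uniform prediction: -ln(1 / num_classes).
chance_loss = -math.log(1.0 / num_classes)
print(round(chance_loss, 3))  # 3.045

# A mean loss of 1.8 corresponds to a geometric-mean probability
# of exp(-1.8) on the correct class.
avg_prob_at_1_8 = math.exp(-1.8)
```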
First collapse the H x W x num_classes output into an H x W map with argmax. By the definition of semantic segmentation, each pixel represents one class, so you can give each pixel a color (RGB), with one color per class.
So does this mean applying argmax over the num_classes dimension to get an H x W image and then converting it into an RGB image?
Yes, you are right. Assign the color according to the class index.
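The argmax-then-color step can be sketched in NumPy as follows. `voc_colormap` reproduces the bit-shuffling scheme commonly used to generate the Pascal VOC palette; `colorize` is a hypothetical helper name, and the class axis is assumed to be the last one.

```python
import numpy as np

def voc_colormap(n=256):
    """Build an (n, 3) uint8 palette with the usual VOC bit-shuffling scheme."""
    cmap = np.zeros((n, 3), dtype=np.uint8)
    for i in range(n):
        r = g = b = 0
        c = i
        for j in range(8):
            # Spread the low three bits of c over the high bits of r, g, b.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap[i] = (r, g, b)
    return cmap

def colorize(scores):
    """Turn (H, W, num_classes) scores into an (H, W, 3) RGB image."""
    # argmax over the class axis yields one class index per pixel, shape (H, W).
    class_map = scores.argmax(axis=-1)
    # Integer-array indexing looks up one RGB triple per pixel.
    return voc_colormap()[class_map]
```

The resulting uint8 array can be passed to an image viewer, for example `matplotlib.pyplot.imshow`. Note that argmax returns one index per pixel, not one value for the whole image, which is why the result is still an H x W map.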
@lxtGH But argmax will give only one maximum among all 21 values. How would I arrange them?
@lxtGH I used res.argmax(-1), where res is my 21-channel prediction. The output of res.argmax(-1) is all 0, and I get the output in the following way:
This is my model structure for FCN-32:
These are my hyper-parameter settings:
Have I built the model correctly? Why am I getting all-zero predictions?
How many iterations have you trained the model for? I was once in a similar situation, but after enough epochs the model started to output segmentations.
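One way to check whether an all-zero output is simply the model predicting class 0 (background in VOC) everywhere is to compare the class distribution of the predictions with that of the ground truth. This is a diagnostic sketch, not something from the thread: `class_fractions` is a hypothetical helper, and treating 255 as a void label to skip is an assumption based on the VOC convention.

```python
import numpy as np

def class_fractions(class_map, num_classes=21, ignore_index=255):
    """Fraction of non-ignored pixels assigned to each class in an (H, W) index map."""
    valid = class_map != ignore_index
    # Boolean indexing flattens the map to the valid pixels only.
    counts = np.bincount(class_map[valid], minlength=num_classes)
    return counts / max(counts.sum(), 1)
```

If the prediction puts all of its mass on class 0 while the ground truth does not, the network has so far only learned the dominant class, which would be consistent with the advice to train for more epochs.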
@kaixin I am running this model for 1 to 4 epochs. Should I train for more epochs? What would be a good number of epochs to train for?
@keyur_paralkar The number of epochs mentioned in the paper is 175. Try 50 or more; I think you will get something other than a blank mask.
@kaixin and @lxtGH Thank you both, I am able to get predictions with the FCN-32 model now.