Does anyone here know the field of semantic segmentation?

I designed a net for semantic segmentation myself and train it on PASCAL VOC 2012.
I crop the images to 512×512 with batch_size=1.
I use BatchNorm2d to promote convergence.
The loss function is NLLLoss2d(log_softmax(output), target).
I use SGD with lr=0.001 and momentum=0.9.
After 17 epochs the loss value is very small, so I get a good model on the train set, but I get much worse results when I run the model on the val set, and I also tried models saved at different loss values on the val set.
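Here is a minimal sketch of my training step (a 1×1 conv stands in for my actual net; the other values match the setup above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for my actual net: a 1x1 conv mapping RGB to the 21 VOC classes.
model = nn.Conv2d(3, 21, kernel_size=1)

criterion = nn.NLLLoss2d()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

image = torch.randn(1, 3, 512, 512)           # batch_size=1, 512x512 crop
target = torch.randint(0, 21, (1, 512, 512))  # one class index per pixel

output = model(image)                          # shape: (1, 21, 512, 512)
loss = criterion(F.log_softmax(output, dim=1), target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```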
I think the problem is not overfitting.
Can anybody give me some suggestions?
This link shows my results.

I haven’t checked your code yet, but BatchNorm layers with a batch size of one might lead to bad performance. If your batch size is limited by your GPU memory, you could try InstanceNorm instead.
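Something like this (a minimal sketch; the conv block is just an example):

```python
import torch
import torch.nn as nn

# Example conv block: InstanceNorm2d normalizes each sample on its own,
# so it does not rely on batch statistics, which get noisy at batch_size=1.
block = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.InstanceNorm2d(64),  # drop-in replacement for nn.BatchNorm2d(64)
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 64, 512, 512)  # works fine with a batch size of one
out = block(x)
```

Note that nn.InstanceNorm2d uses affine=False by default; set affine=True if you want the learnable scale and shift that BatchNorm has by default.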

OK, I will try it tonight. My GPU is a GTX 1060, which is quite low-end.

I checked the docs. nn.GroupNorm is only in the unstable version.

Yes, if you need nn.GroupNorm, you would have to build PyTorch from source.
Have a look at these instructions to build it.
It should be pretty easy, but let me know if you encounter any issues.

nn.InstanceNorm, however, is in the current stable release of PyTorch.

In my code I use vgg16_bn as the encoder. If I use InstanceNorm2d in the decoder, the net does not converge; the loss value keeps oscillating.

Maybe the problem is that the source is the PyTorch 0.4 code, which does not include nn.GroupNorm, if I get the source from the link you showed me.

The git clone command should clone the current master branch, where nn.GroupNorm was added.
How did you try to install it?
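To verify which build you are actually importing, you could run a quick check like this:

```python
import torch
import torch.nn as nn

print(torch.__version__)         # a source build of master shows a dev version string
print(hasattr(nn, 'GroupNorm'))  # should print True on current master
```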

Oh, I only checked the code, and I did not find nn.GroupNorm in torch/nn/modules. Let me look again.

Sorry, I found it. Thank you for your help.

I want to ask you another question: does GroupNorm need less memory than BatchNorm?

I’m not sure if nn.GroupNorm needs less memory than nn.BatchNorm in the layer implementation itself.
But in general you can save memory, because you can lower the batch size down to 2 without sacrificing much accuracy.
You can see this effect in Figure 1 of the paper.
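For reference, a minimal usage sketch (the channel count here is arbitrary; the paper uses 32 groups by default):

```python
import torch
import torch.nn as nn

# GroupNorm computes statistics per sample over groups of channels,
# so its accuracy does not depend on the batch size.
gn = nn.GroupNorm(num_groups=32, num_channels=64)

x = torch.randn(2, 64, 128, 128)  # small batches are fine
out = gn(x)
```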

Thank you!