Help with lowering parameters of U-Net model

Can anyone give me some tips on how to lower the number of parameters in the following U-Net implementation? I'm having trouble with overfitting on my training data, and I would like to reduce the parameter count to see if it improves the validation accuracy.
Layers:

layers = [
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True)
    ]

Encoder2D

layers = [
        nn.MaxPool2d(kernel_size=downsample_kernel),
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True)
    ]

Center2D

layers = [
        nn.MaxPool2d(kernel_size=2),
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.ConvTranspose2d(out_channels, deconv_channels, kernel_size=2, stride=2)
    ]

Decoder2D

layers = [
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.ConvTranspose2d(out_channels, deconv_channels, kernel_size=2, stride=2)
    ]

Last2D

layers = [
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=1),
        nn.Softmax(dim=1)
    ]

You could start by reducing the number of filters in the conv layers (and, correspondingly, the number of features in the batchnorm layers).
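For example, here is a minimal sketch of that idea applied to your first block (the conv_block name and the width argument are made up for illustration; in_channels, middle_channels, and out_channels are the same arguments as in your blocks). Keep in mind that if a block's out_channels shrinks, the in_channels of the next block, and of the concatenated skip connections on the decoder side, have to shrink to match:

import torch.nn as nn

def conv_block(in_channels, middle_channels, out_channels, width=0.5):
    # Scale the internal channel counts by a width multiplier. Halving the
    # channels roughly quarters the parameters of each 3x3 conv, since its
    # weight count is in_ch * out_ch * 3 * 3.
    middle_channels = max(1, int(middle_channels * width))
    out_channels = max(1, int(out_channels * width))
    return nn.Sequential(
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    )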

Would I still be able to use pre-trained weights that were trained on the network with the larger number of parameters?

You won't be able to load the state_dict anymore, as it would create shape mismatches.
However, you could keep the original model (with the larger parameter set), load the state_dict into it, and reduce the number of parameters afterwards.
The main question is how you would like to reduce the number of filters given the pre-trained model:

  • remove half of the filters (the lower half, the upper half, or random indices?), as in the sketch below
  • combine the filters somehow (mean, median, sum?)
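For the first option, here is a minimal sketch that keeps the lower-index half of the filters. It assumes every channel count in the smaller model is exactly half of the original; the checkpoint path and small_model are hypothetical, and the very first conv (whose in_channels is fixed by the input images) and the final 1x1 conv (whose out_channels is the number of classes) would need to be skipped or special-cased:

import torch

old_sd = torch.load("pretrained_unet.pth")  # hypothetical checkpoint path
new_sd = {}
for name, param in old_sd.items():
    # Conv2d weights have shape [out_ch, in_ch, kH, kW]; conv biases and
    # batchnorm weights/biases/running stats have shape [out_ch];
    # num_batches_tracked is 0-dim and is copied unchanged.
    if param.dim() >= 1:
        param = param[: param.size(0) // 2]     # lower half of the out channels
    if param.dim() == 4:
        param = param[:, : param.size(1) // 2]  # and of the in channels for convs
    new_sd[name] = param

# small_model = ...  # the reduced architecture (hypothetical)
# small_model.load_state_dict(new_sd)

The averaging variant would instead fold pairs of filters together rather than dropping half of them.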

Okay. The U-Net CNN model I'm using with pre-trained weights was trained on color images of Arabidopsis thaliana plants. In my project I'm training the model on grayscale images instead, but it seems to overfit on my data. That's why I intend to lower the number of parameters. What would be your tips for doing so?
I'm trying out some dropout at the moment.
Would freezing some layers during training be beneficial?
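Roughly what I'm trying for both ideas (just a sketch; the Dropout2d placement and the model.encoder attribute name are assumptions about my model, not code from the post above):

import torch.nn as nn

def conv_block(in_channels, middle_channels, out_channels, p=0.25):
    # Same block as before, with spatial dropout after each activation.
    # Dropout2d drops whole feature maps, which tends to regularize
    # conv nets better than element-wise dropout.
    return nn.Sequential(
        nn.Conv2d(in_channels, middle_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(middle_channels),
        nn.ReLU(inplace=True),
        nn.Dropout2d(p),
        nn.Conv2d(middle_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.Dropout2d(p),
    )

def freeze_encoder(model):
    # Keep the pre-trained encoder fixed so only the remaining layers
    # adapt to the grayscale data; model.encoder is a hypothetical name.
    for param in model.encoder.parameters():
        param.requires_grad = False

# The optimizer should then only receive the trainable parameters:
# optimizer = torch.optim.Adam(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-4)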

Thanks for the help.