Model just predicts one class during training and validation


I’m training a model that has to classify images into three different classes, but during training it only ever predicts class 1. I don’t know what’s going on; I’ve tried a lot of things, but nothing helps. Backpropagation seems OK, and the hyperparameters seem OK as well.

I’m using a pretrained EfficientNet-B0 as the base model and freezing its layers. I’m pruning it to keep the first 6 MBConv blocks and fine-tuning it with GAP and an FC head, as follows:

import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet  # assuming the efficientnet_pytorch package

class EfficientNet_SW(nn.Module):
    """
        EfficientNet fine tuned to share weights between two inputs
        IN: 2x (256x512x2) images
        OUT: 1x3 logits divided in three classes
    """
    def __init__(self, model_name = 'efficientnet_sw', num_blocks = 6, num_classes=3):
        super(EfficientNet_SW, self).__init__()
        self.base_model = EfficientNet.from_pretrained('efficientnet-b0', in_channels = 2)
        self.model_name = model_name
        if num_blocks < 0: # get the whole net
            self.model = self.base_model
        else: # get the defined number of blocks
            # Last Conv block layer (gets the parameters of the pruned model)
            self.model = nn.Sequential(*list(self.base_model.children())[:2], *list(self.base_model._blocks.children())[:num_blocks+2])
        # Pooling over the concatenated feature map (attribute name assumed)
        self.pooling_layer = nn.AvgPool2d((32, 32))
        self.classifier_layer = nn.Sequential(
            nn.Linear(80 , 64),
            nn.Linear(64 , num_classes)
        )

        # Freeze those weights
        for p in self.model.parameters():
            p.requires_grad = False

    def forward(self, x1, x2):
        # Sharing weights
        x1 = self.model(x1)
        x2 = self.model(x2)

        # Concatenating the result
        x = torch.cat((x1, x2), dim=2)

        # Pooling and final linear layer
        x = self.pooling_layer(x)

        # Flatten
        x = x.view(x.size(0), -1)
        x = self.classifier_layer(x)

        return x

As you can see, I’m sharing weights between the two inputs, which I think works fine.
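The shapes in the forward pass can be sanity-checked without loading the pretrained network. The following is only a sketch with stand-in tensors: it assumes that each 256x512 input comes out of the pruned backbone as an 80-channel, 16x32 feature map (which is what the nn.Linear(80, 64) layer and the 32x32 pooling window suggest), and the batch size of 4 is arbitrary.

```python
import torch
import torch.nn as nn

# Stand-in feature maps (assumption): batch of 4, 80 channels, 16x32 spatial size,
# used here in place of the real backbone outputs.
f1 = torch.randn(4, 80, 16, 32)
f2 = torch.randn(4, 80, 16, 32)

# Concatenate along the height axis (dim=2), as in forward()
x = torch.cat((f1, f2), dim=2)
cat_shape = tuple(x.shape)

# Average pooling with a 32x32 window, then flatten to (batch, channels)
x = nn.AvgPool2d((32, 32))(x)
x = x.view(x.size(0), -1)
flat_shape = tuple(x.shape)

# Same head as in the class above, with 3 output classes
classifier = nn.Sequential(nn.Linear(80, 64), nn.Linear(64, 3))
logits_shape = tuple(classifier(x).shape)

print(cat_shape, flat_shape, logits_shape)
```

If the real backbone returns a different spatial size, the pooled tensor will no longer flatten to 80 features and the first linear layer will raise a shape error.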

The problem is that, during the training loop, my model just ‘learns’ to predict 1 as the output class, and the accuracy on the validation set is always 50%, because that set is 50% class 0 and 50% class 1. My training loop works as follows:

        # Training step
        with tqdm(train_dl, unit="batch") as tepoch:
            for i, data in enumerate(tepoch, 0):
                # zero the parameter gradients
                optimizer.zero_grad()

                tepoch.set_description(f"Epoch {epoch}")

                # Transfer to GPU
                image, label = data['image'], data['label']
                # Divide in 2 lungs
                lung1, lung2 = image[0], image[1]
                lung1, lung2 = lung1.to(device), lung2.to(device)
                # One label to each lung (it's the same)
                label1, label2 = label[0], label[1]
                label1, label2 = label1.to(device), label2.to(device)

                # forward + backward + optimize
                output = model(lung1, lung2)
                loss = criterion(output, label1)

The output is being compared with label1 because label1 and label2 are the same.
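One way to confirm that the predictions have collapsed to a single class is to compare the histogram of predicted classes with the histogram of labels. This is a minimal sketch: the logits and labels below are made-up values, and class_counts is a hypothetical helper, not part of the code above.

```python
import torch

def class_counts(values: torch.Tensor, num_classes: int) -> list:
    """Count how often each class index occurs in a tensor of class ids."""
    return torch.bincount(values.flatten(), minlength=num_classes).tolist()

# Hypothetical batch: model logits for 3 classes and the matching integer labels
logits = torch.tensor([[0.1, 2.0, -1.0],
                       [0.3, 1.5, 0.0],
                       [-0.2, 0.9, 0.1],
                       [0.0, 1.1, 0.4]])
labels = torch.tensor([0, 1, 0, 1])

preds = logits.argmax(dim=1)
pred_counts = class_counts(preds, 3)
label_counts = class_counts(labels, 3)
print("predicted:", pred_counts, "labels:", label_counts)
```

Logging these two lists once per epoch makes it obvious whether the predictions ever move away from one class, which the accuracy figure alone can hide.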

I’ve been stuck on this for a few days and I don’t know what’s wrong. I would be very glad if someone could help me.


Try to overfit a small dataset, e.g. just 10 samples, and make sure your model is able to do so by playing around with some hyperparameters. Once your model is able to do so, scale up the use case again.
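A rough sketch of such an overfitting test is shown below. It does not use the model from the question: the tiny MLP, the random 10-sample dataset, the learning rate, and the step count are all placeholder assumptions, chosen only to show the shape of the check.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: 10 random samples with 16 features, 3 classes
inputs = torch.randn(10, 16)
targets = torch.tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for step in range(300):
    optimizer.zero_grad()                     # reset gradients from the previous step
    loss = criterion(model(inputs), targets)  # forward pass on the same 10 samples
    loss.backward()                           # backpropagate
    optimizer.step()                          # update the weights
    losses.append(loss.item())

# Training accuracy on the memorized samples
with torch.no_grad():
    accuracy = (model(inputs).argmax(dim=1) == targets).float().mean().item()
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}, accuracy {accuracy:.2f}")
```

If the real model cannot drive the training loss down on a handful of samples like this, the problem is more likely in the data pipeline, the labels, or the optimization setup than in generalization.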


Thanks for the answer. When I use self.model = nn.Sequential(*list(self.base_model.children())[:2], *list(self.base_model._blocks.children())[:num_blocks+2]), am I getting the weights as well? It seems like I should use deepcopy here, but I’m not sure.

The parameters would be indexed as well, and you could check them in the newly created nn.Sequential module via print(dict(self.model.named_parameters())).
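This can be checked on a small stand-in model. In the sketch below, TinyBase is a hypothetical replacement for the pretrained network; the point is only that nn.Sequential built from children() reuses the same parameter objects, while copy.deepcopy creates independent ones.

```python
import copy
import torch.nn as nn

class TinyBase(nn.Module):
    """Hypothetical stand-in for the pretrained base model."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(2, 8, kernel_size=3)
        self.blocks = nn.ModuleList([nn.Conv2d(8, 8, kernel_size=3) for _ in range(4)])

base = TinyBase()

# Same slicing pattern as in the question: first child plus the first two blocks
pruned = nn.Sequential(*list(base.children())[:1], *list(base.blocks.children())[:2])

# The Sequential holds the very same Parameter objects as the base model
shares_weights = pruned[0].weight is base.stem.weight
names = list(dict(pruned.named_parameters()).keys())

# deepcopy is only needed if the pruned model should own independent weights
independent = copy.deepcopy(pruned)
copy_is_separate = independent[0].weight is not base.stem.weight

print(names, shares_weights, copy_is_separate)
```

Because the parameters are shared, setting requires_grad = False through the pruned module also freezes them in the base model.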
