Hey
I’m training a model to classify images into three classes, but during training it only ever predicts class 1. I don’t know what’s going on; I’ve tried a lot of things and nothing helps. Backpropagation seems fine, and the hyperparameters seem fine as well.
I’m using a pretrained EfficientNet-B0 as the base model and freezing its layers. I’m pruning it to keep the first 6 MBConvBlocks and fine-tuning it with global average pooling (GAP) and a fully connected (FC) head, as follows:
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet  # assuming the efficientnet_pytorch package


class EfficientNet_SW(nn.Module):
    """
    EfficientNet fine-tuned to share weights between two inputs
    IN: 2x (256x512x2) images
    OUT: 1x3 logits, one per class
    """
    def __init__(self, model_name='efficientnet_sw', num_blocks=6, num_classes=3):
        super(EfficientNet_SW, self).__init__()
        self.base_model = EfficientNet.from_pretrained('efficientnet-b0', in_channels=2)
        self.model_name = model_name
        if num_blocks < 0:  # get the whole net
            self.model = self.base_model
        else:  # get the defined number of blocks
            # First two children of the base model plus the first num_blocks+2
            # MBConv blocks (gets the parameters of the pruned model)
            self.model = nn.Sequential(
                *list(self.base_model.children())[:2],
                *list(self.base_model._blocks.children())[:num_blocks + 2],
            )
        self.gap = nn.AvgPool2d((32, 32))
        print(self.model)
        self.classifier_layer = nn.Sequential(
            nn.Linear(80, 64),
            nn.BatchNorm1d(64),
            nn.Dropout(0.2),
            nn.Linear(64, num_classes),
        )
        # Freeze those weights
        for p in self.model.parameters():
            p.requires_grad = False

    def forward(self, x1, x2):
        # Sharing weights
        x1 = self.model(x1)
        x2 = self.model(x2)
        # Concatenating the result
        x = torch.cat((x1, x2), dim=2)
        # Pooling and final linear layer
        x = self.gap(x)
        # Flatten
        x = x.view(x.size(0), -1)
        x = self.classifier_layer(x)
        return x
As you can see, I’m sharing weights between the two inputs, which I think works fine.
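A rough sketch of the shape arithmetic behind the 80 input features of the first Linear layer. It assumes inputs of height 256 and width 512, "same" padding (so each stride-2 layer halves the size, rounded up), and the standard EfficientNet-B0 stage layout; the names `B0_STAGES` and `pruned_feature_shape` are made up for this illustration and are not part of any library.

```python
import math

# Standard EfficientNet-B0 stages as (number of blocks, stride, output channels).
B0_STAGES = [
    (1, 1, 16), (2, 2, 24), (2, 2, 40), (3, 2, 80),
    (3, 1, 112), (4, 2, 192), (1, 1, 320),
]


def pruned_feature_shape(height, width, kept_blocks):
    """Return (channels, height, width) after the stem and the first `kept_blocks` blocks."""
    # The stem convolution has stride 2 and 32 output channels.
    channels = 32
    height, width = math.ceil(height / 2), math.ceil(width / 2)
    remaining = kept_blocks
    for repeats, stride, out_channels in B0_STAGES:
        for block_index in range(repeats):
            if remaining == 0:
                return channels, height, width
            # Only the first block of a stage applies the stage stride.
            s = stride if block_index == 0 else 1
            height, width = math.ceil(height / s), math.ceil(width / s)
            channels = out_channels
            remaining -= 1
    return channels, height, width


# num_blocks=6 keeps num_blocks + 2 = 8 blocks in the constructor above.
c, h, w = pruned_feature_shape(256, 512, kept_blocks=6 + 2)
print((c, h, w))        # (80, 16, 32) per lung
concat = (c, 2 * h, w)  # torch.cat(..., dim=2) stacks the two lungs along the height axis
print(concat)           # (80, 32, 32)
```

Under these assumptions the (32, 32) average pool reduces the concatenated map to 80 x 1 x 1, which matches `nn.Linear(80, 64)`.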
The problem is that, during training, my model just ‘learns’ to predict class 1, and the accuracy on the validation set is always 50%, because that set is 50% class 0 and 50% class 1. My training loop looks like this:
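To make the symptom concrete, here is a minimal sketch of how the collapse onto one class could be confirmed. It assumes the predicted classes and the labels have been collected into plain Python lists of integers (for example via `output.argmax(dim=1).tolist()`); the helper names `class_histogram` and `per_class_accuracy` are hypothetical, and the data below is a made-up toy example, not my actual results.

```python
from collections import Counter


def class_histogram(values, num_classes=3):
    """Count how often each class index appears in `values`."""
    counts = Counter(values)
    return [counts.get(c, 0) for c in range(num_classes)]


def per_class_accuracy(preds, labels, num_classes=3):
    """Accuracy per true class; None for classes that never occur in `labels`."""
    correct = Counter(l for p, l in zip(preds, labels) if p == l)
    totals = Counter(labels)
    return [correct[c] / totals[c] if totals[c] else None for c in range(num_classes)]


# Made-up toy data mimicking the situation: labels split 50/50, predictions all 1.
labels = [0, 1, 0, 1, 0, 1]
preds = [1, 1, 1, 1, 1, 1]

print(class_histogram(preds))             # [0, 6, 0]
print(class_histogram(labels))            # [3, 3, 0]
print(per_class_accuracy(preds, labels))  # [0.0, 1.0, None]
```

An overall accuracy of 50% with a per-class accuracy of 0.0 and 1.0 is what a constant prediction looks like on a balanced two-class validation set.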
# Training step
model.train()
with tqdm(train_dl, unit="batch") as tepoch:
    for i, data in enumerate(tepoch, 0):
        # zero the parameter gradients
        optimizer.zero_grad()
        tepoch.set_description(f"Epoch {epoch}")
        # Transfer to GPU
        image, label = data['image'], data['label']
        # Divide in 2 lungs
        lung1, lung2 = image[0], image[1]
        lung1, lung2 = lung1.to(device), lung2.to(device)
        # One label to each lung (it's the same)
        label1, label2 = label[0], label[1]
        label1, label2 = label1.to(device), label2.to(device)
        # forward + backward + optimize
        output = model(lung1, lung2)
        loss = criterion(output, label1)
        loss.backward()
        optimizer.step()
The output is compared with label1 because label1 and label2 are the same.
I’ve been stuck on this for a few days and I don’t know what’s wrong. I’d be very glad if someone could help.
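A small sketch of how that assumption could be checked on a batch. It assumes the labels have been converted to plain Python lists of integers (for example with `.tolist()`) and that `criterion` expects integer class indices in the range 0 to num_classes - 1, as `nn.CrossEntropyLoss` does; the helper name `check_labels` is hypothetical.

```python
def check_labels(label1, label2, num_classes=3):
    """Raise ValueError if the two label lists differ or contain invalid class indices."""
    label1, label2 = list(label1), list(label2)
    if label1 != label2:
        raise ValueError("label1 and label2 differ")
    # Every label must be an integer class index in [0, num_classes).
    bad = [v for v in label1 if not (isinstance(v, int) and 0 <= v < num_classes)]
    if bad:
        raise ValueError(f"labels outside [0, {num_classes}): {bad}")
    return True


# Made-up example batch.
print(check_labels([0, 1, 1, 0], [0, 1, 1, 0]))  # True
```

Calling it once per batch before `criterion(output, label1)` would rule out a mismatch between the two lungs' labels as the cause.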
Thanks