I’m trying to train a pneumonia classifier using ResNet34. During training, the loss is increasing and the accuracy is decreasing drastically (on both the training and validation sets). What might be the reason for this?
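import torch
from torch import nn, optim
from torchmetrics import Accuracy
from torchvision import models
from tqdm import tqdm

device = 'cuda' if torch.cuda.is_available() else 'cpu'


def train(model, dataloaders, loss, optimizer, epochs=5):
    train_loader = dataloaders['train']
    valid_loader = dataloaders['valid']
    # torchmetrics >= 0.11 requires the task argument
    metric = Accuracy(task='multiclass', num_classes=2).to(device)
    for epoch in tqdm(range(epochs), desc="EPOCHS : "):
        model.train()
        cst = 0
        for x, y in tqdm(train_loader, leave=True, desc="Training : "):
            optimizer.zero_grad()
            x = x.to(device)
            y = y.to(device)
            preds = model(x)
            metric(preds.argmax(dim=1), y)  # updates the running accuracy
            cost = loss(preds, y)
            cst += cost.item()
            cost.backward()
            optimizer.step()
        acc = metric.compute()
        metric.reset()  # otherwise training batches leak into the validation accuracy
        cst /= len(train_loader)
        print(f'Train loss : {cst} \t Train acc : {acc}')
        model.eval()
        cst = 0
        with torch.no_grad():  # no gradients are needed for validation
            for x, y in tqdm(valid_loader, leave=True, desc="Validation : "):
                x = x.to(device)
                y = y.to(device)
                preds = model(x)
                metric(preds.argmax(dim=1), y)
                cost = loss(preds, y)
                cst += cost.item()
        acc = metric.compute()
        metric.reset()
        cst /= len(valid_loader)
        print(f'Valid loss : {cst} \t Valid acc : {acc}')
    return model


# Newer torchvision releases use the weights= argument instead of pretrained=True
model = models.resnet34(pretrained=True)
# Freeze the backbone; only the new classification head is trained
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Sequential(
    nn.Dropout(p=.7),
    nn.Linear(in_features=model.fc.in_features, out_features=2),
    nn.LogSoftmax(dim=1)
)
model = model.to(device)
LR = 3e-3
WD = 1e-4
loss = nn.NLLLoss()  # expects log-probabilities, hence the LogSoftmax head
optimizer = optim.Adam(model.parameters(), lr=LR, weight_decay=WD)
# dataloaders is a dict with 'train' and 'valid' DataLoaders, defined elsewhere
md = train(model, dataloaders, loss, optimizer, epochs=5)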
Well, the obvious answer is that nothing is wrong here: if the model is not suited to your data distribution, it simply won’t produce the desired results. Also, I think you should reframe your question: if the loss increases, then the accuracy will certainly decrease.
That’s just my opinion; I may be missing the point here.
I tried different architectures as well, but the result is the same. And I don’t think I should reframe the question, as you can see from the screenshot.
Can you check the initial loss of your model with random data? It should be around -ln(1/num_classes). If this value is close then it suggests that your model is initialized properly. The next thing to check would be that your data format as input to the model makes sense (e.g., from the perspective of data layout, etc.)
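The check described above can be sketched roughly as follows. This is only an illustration: `initial_loss_check` and `toy_model` are hypothetical names, the toy model stands in for the real network, and the model is assumed to output log-probabilities (as with a `LogSoftmax` head and `NLLLoss`).

```python
import math

import torch
from torch import nn


def initial_loss_check(model, num_classes, input_shape, batch_size=64):
    """Return (observed_loss, expected_loss) for random inputs and random labels."""
    model.eval()  # disable dropout and batch-norm updates for the check
    x = torch.randn(batch_size, *input_shape)
    y = torch.randint(0, num_classes, (batch_size,))
    with torch.no_grad():
        log_probs = model(x)  # assumed to be log-probabilities
        observed = nn.NLLLoss()(log_probs, y).item()
    expected = -math.log(1.0 / num_classes)  # loss of a uniform prediction
    return observed, expected


# Hypothetical stand-in for the real network: a tiny classifier with a LogSoftmax head
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2), nn.LogSoftmax(dim=1))
observed, expected = initial_loss_check(toy_model, num_classes=2, input_shape=(3, 32, 32))
print(f"observed {observed:.3f} vs expected {expected:.3f}")
```

For two classes the expected value is ln(2), roughly 0.693. An observed loss far above that on random data would point at the initialization or the output layer rather than at the dataset.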
From here, if your loss is not even going down initially, you can try simple tricks like decreasing the learning rate until it starts training. If the loss is going down initially but stops improving later, you can try things like more aggressive data augmentation or other regularization techniques.
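One way to try the "decrease the learning rate until it starts training" suggestion is to run a few optimizer steps on a single fixed batch for each candidate rate and see whether the loss drops. The sketch below assumes a toy linear model and synthetic data in place of the real X-ray batches; `loss_drop_for_lr` is a hypothetical helper, and it uses `CrossEntropyLoss` on raw logits rather than the thread's `LogSoftmax` plus `NLLLoss` pairing.

```python
import copy

import torch
from torch import nn, optim


def loss_drop_for_lr(model, x, y, lr, steps=20):
    """Train a copy of `model` on one fixed batch; return (first_loss, last_loss)."""
    trial = copy.deepcopy(model)  # leave the original weights untouched
    trial.train()
    criterion = nn.CrossEntropyLoss()  # the toy model below outputs raw logits
    optimizer = optim.Adam(trial.parameters(), lr=lr)
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        cost = criterion(trial(x), y)
        cost.backward()
        optimizer.step()
        losses.append(cost.item())
    return losses[0], losses[-1]


torch.manual_seed(0)
# Hypothetical toy data and model, standing in for the real batches and network
x = torch.randn(32, 10)
y = (x[:, 0] > 0).long()
model = nn.Linear(10, 2)

for lr in (3e-3, 1e-3, 3e-4, 1e-4):
    first, last = loss_drop_for_lr(model, x, y, lr)
    print(f"lr={lr:g}: loss {first:.4f} -> {last:.4f}")
```

If the loss cannot be driven down even on one repeated batch at any of these rates, the problem is more likely in the data pipeline or the loss setup than in the learning rate.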
@eqy I changed the model from resnet34 to resnet18. The loss is stable, but the model is learning very slowly. The accuracy starts at around 25% and rises eventually, but very slowly. It takes around 10 to 15 epochs to reach 60% accuracy. I tried increasing the learning rate, but the results don’t differ that much.
Ok, that sounds normal. At this point I would see if there are any data augmentations that you can apply that make sense for your dataset, as well as other model architectures, etc.
@eqy Ok, let me explain the project I’m working on. I’m trying to classify pneumonia patients from X-ray images. Below are the transforms I’m currently using.
Before you ask why I am using the Invert transform on the validation set: I think this transform helps bring out the pneumonia regions in the X-ray images, so I used it on the validation and test sets as well (if this is a bad idea, please correct me). After applying the transforms, the images look something like this:
Nice. @Lucky_Magna Could you please share the performance of your final model?
For example, the training and validation loss plots, and possibly accuracy plots as well.