Hello, I am trying to fix an issue where my validation loss is absurdly high compared to my training loss. I am training a 2-class classifier. The training part seems to work fine: after the model is trained, I evaluate it on a third set (the test set) and the results are as expected. But during training, when I compute the loss on the validation set, I get nonsensical values over 1000, and the loss does not decrease from epoch to epoch. I think my validation code is incorrect, but I don't know what is wrong. Here is my code:
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiplicativeLR


def train(model, train_loader, valid_loader, learning_rate, learning_rate_decay_rate, epochs, device, saved_model_filepath=None):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    # Multiply the learning rate by a constant factor after every epoch
    lr_lambda_1 = lambda epoch: learning_rate_decay_rate
    scheduler = MultiplicativeLR(optimizer, lr_lambda=lr_lambda_1)

    for i in range(epochs):
        # Running sums of the per-batch losses for this epoch
        total_loss_train = 0
        total_loss_valid = 0
        valid_correct_preds = 0

        # Training pass
        model.train()
        for images, labels in train_loader:
            images = images.to(device)
            labels = labels.to(device)
            preds = model(images)
            optimizer.zero_grad()
            loss = criterion(preds, labels)
            loss.backward()
            optimizer.step()
            total_loss_train += loss.item()

        # Validation pass: eval mode, no gradient tracking
        model.eval()
        with torch.no_grad():
            for batch in valid_loader:
                images = batch[0].to(device)
                labels = batch[1].to(device)
                preds = model(images)
                loss = criterion(preds, labels)
                total_loss_valid += loss.item()

        scheduler.step()
        print(f'epoch: {i}, total_loss_train: {total_loss_train: .2f}')
        print(f'epoch: {i}, total_loss_valid: {total_loss_valid: .2f}')
        print()
The training loss decreases and its values look normal, but the validation loss is very high (usually between a couple of hundred and 1500) and does not decrease at all. What am I doing wrong?
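One check that might help narrow this down is to measure both loaders in exactly the same way. The code above prints a sum of per-batch losses, so the two totals depend on how many batches each loader has and are not directly comparable. Below is a minimal sketch of a helper that reports the mean per-sample loss and the accuracy in eval mode. The names `evaluate`, `toy_model`, `toy_x`, `toy_y` and `toy_loader` are hypothetical and not part of my code above. The toy model and random data only stand in for the real ones.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device):
    """Return (mean per-sample loss, accuracy) over `loader` with the model in eval mode."""
    # reduction='sum' adds up per-sample losses, so dividing by the sample
    # count below gives a mean that does not depend on the batch size
    criterion = nn.CrossEntropyLoss(reduction='sum')
    total_loss, correct, seen = 0.0, 0, 0
    was_training = model.training  # remember the mode so it can be restored
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)
            total_loss += criterion(logits, labels).item()
            correct += (logits.argmax(dim=1) == labels).sum().item()
            seen += labels.size(0)
    model.train(was_training)
    return total_loss / seen, correct / seen


# Hypothetical stand-ins for the real model and data (assumption: 2 classes)
torch.manual_seed(0)
toy_model = nn.Linear(4, 2)
toy_x = torch.randn(10, 4)
toy_y = torch.randint(0, 2, (10,))
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=4)

mean_loss, accuracy = evaluate(toy_model, toy_loader, torch.device('cpu'))
print(f'mean loss: {mean_loss:.4f}, accuracy: {accuracy:.2f}')
```

Calling `evaluate(model, train_loader, device)` and `evaluate(model, valid_loader, device)` after each epoch would give two numbers on the same scale. If the training data also scores badly in eval mode, that would point toward layers that behave differently between `train()` and `eval()`. If only the validation data scores badly, the validation data or its preprocessing would be the first thing to check.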