Is my validation loss code correct? How to early stop and prevent overfitting?

sandeep1 · March 16, 2021, 5:38am

Hi,
I am new to PyTorch. It would be helpful if you could comment on the code below and tell me whether I have made a mistake or whether there is a better way to do it.
Also, how do I stop early and save the best model?

for epoch in range(num_epochs):
  train_loss = 0.
  full_val_loss = 0.
  # batch_idx must count batches within the epoch, so take it from enumerate
  for batch_idx, (x, y, y2) in enumerate(loader):  # y2 is not used below
    optimizer.zero_grad()
    x = embedding(x).to(device)
    y = y.to(device)  # targets must be on the same device as the model output
    input_size = x.shape[2]
    output1 = model(x)
    loss = criterion1(output1, y)
    loss.backward()

    optimizer.step()
    # running mean of the per-batch loss; .item() returns a plain Python float
    train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.item() - train_loss))

  model.eval()
  with torch.no_grad():
    for batch_idx, (x, y) in enumerate(val_loader):
      x = embedding(x).to(device)
      y = y.to(device)
      output1 = model(x)
      val_loss = criterion1(output1, y)
      full_val_loss = full_val_loss + ((1 / (batch_idx + 1)) * (val_loss.item() - full_val_loss))

  model.train()
  print('Epoch [{}/{}], Loss: {:.4f} , val_loss: {:.4f} '.format(epoch+1, num_epochs, train_loss, full_val_loss))

Dwight_Foster · March 16, 2021, 1:59pm

What is y2 for in the train loop?

patrickwilliams3 · March 16, 2021, 6:07pm

Why no torch.no_grad()? I thought it was useful to have in evaluation mode since it prevents the autograd engine from creating the graph for the backwards pass, so you save memory when you do not need it.
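
A quick way to check this is to compare a forward pass with and without torch.no_grad(). This is only a minimal sketch: the toy nn.Linear model and the random input are made up for illustration and are not taken from the original post.

```python
import torch

# Toy model and input, assumed purely for illustration.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)

# Normal forward pass: autograd records the graph needed for backward().
out_train = model(x)

# Forward pass under no_grad: no graph is recorded for the output.
with torch.no_grad():
    out_eval = model(x)

print(out_train.requires_grad, out_train.grad_fn is not None)
print(out_eval.requires_grad, out_eval.grad_fn is not None)
```

The first output carries a grad_fn and the second does not, which is where any memory saving during validation would come from.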

patrickwilliams3 · March 16, 2021, 6:09pm

I thought that the graph is destroyed after you call .backward() and then created again next time you call forward().

Dwight_Foster · March 16, 2021, 6:11pm

Yes, sorry, I said that wrong. The graph is destroyed when you call .backward(). Using torch.no_grad() may lower your GPU usage slightly, but not by a lot. I have tried both ways, and for some reason my GPU memory has never really been different. You can keep it in, however, if it impacts your performance. I am still confused about what y2 does, though.
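
For the early-stopping part of the original question, one common pattern is to track the best validation loss and stop once it has failed to improve for a set number of epochs. Below is a minimal sketch of that idea. The EarlyStopper helper, its patience and min_delta parameters, the loss values, and the "best_model.pt" file name are all hypothetical and do not appear in the original code.

```python
class EarlyStopper:
    """Hypothetical helper: tracks the best validation loss seen so far."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss):
        """Return (improved, should_stop) for this epoch's validation loss."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
            return True, False
        self.bad_epochs += 1
        return False, self.bad_epochs >= self.patience


# Made-up per-epoch validation losses, for illustration only.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.67, 0.50]

stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    improved, should_stop = stopper.update(val_loss)
    if improved:
        # In a real training loop, save the best weights here, e.g.
        # torch.save(model.state_dict(), "best_model.pt")  # assumed file name
        pass
    if should_stop:
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "best val loss", stopper.best_loss)
```

In the training loop from the question, update() would be called once per epoch with full_val_loss, right after the validation pass. The best weights can be restored afterwards with model.load_state_dict(torch.load(...)).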