VAE loss function super negative training loss

eric_zhu · January 30, 2022, 6:03am

I coded my loss function following https://arxiv.org/pdf/1907.08956v1.pdf, but the training loss is extremely negative. Specifically, I’m using the negative of the log-likelihood function with MSE being my reconstruction loss. Why might this be?

My code for the loss function is:

MSELoss_criterion = nn.MSELoss()
MSE_loss = MSELoss_criterion(y_hat, tgts) 

KLDiv_loss = -0.5 * torch.sum(1 + log_var_q - mu_q**2 - log_var_q.exp(), dim=2)
KLDiv_loss = torch.mean(KLDiv_loss)
return -MSE_loss + KLDiv_loss

anantguptadbl · January 30, 2022, 1:22pm

@eric_zhu If you put a negative sign on the MSE loss, your model will have difficulty converging. MSE is always non-negative, so with the sign flipped the optimizer can reduce the loss simply by making the reconstruction error larger and larger, which is why you are seeing an extremely negative loss.
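To make the sign issue concrete, here is a minimal sketch of a loss where both terms are added with a positive sign, so that minimizing it reduces the reconstruction error instead of rewarding it. The function name `vae_loss` is hypothetical, the argument names are taken from the snippet in the question, and the latent dimension is assumed to be the last axis. Keeping MSE as the reconstruction term is itself an assumption (it matches a Gaussian decoder with fixed variance) and may not be the exact likelihood used in the paper.

```python
import torch
import torch.nn.functional as F


def vae_loss(y_hat, tgts, mu_q, log_var_q):
    """Hypothetical helper: reconstruction error plus KL term, both to be minimized."""
    # Reconstruction term with a POSITIVE sign: lower error means lower loss.
    recon = F.mse_loss(y_hat, tgts)
    # KL(q(z|x) || N(0, I)), summed over the latent axis (assumed to be the last one).
    kl = -0.5 * torch.sum(1 + log_var_q - mu_q.pow(2) - log_var_q.exp(), dim=-1)
    # Average the KL over the remaining axes and add it to the reconstruction term.
    return recon + kl.mean()
```

Both terms are non-negative, so this loss is bounded below by zero and cannot run off to large negative values. Note that the MSE here is averaged over all elements while the KL is summed over the latent axis, so the relative weighting of the two terms may still need tuning.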

I don't think you can use MSE loss as a replacement for the ELBO loss mentioned in the paper.