I hope everyone is staying safe and isolating as much as possible/required…
I have a problem with a 1D CVAE I am creating: no matter what I do, after 38 or 39 epochs I always get a NaN value in the loss when using `BCEWithLogitsLoss`.
I suspect it is something in the loss calculation, or that I am setting the model up wrongly, but I really cannot figure it out.
```python
def sample(self, eps=None):
    if eps is None:
        eps = torch.randn(1, self.lat_dim)
    print("eps=", eps)
    return self.decode(eps, apply_sigmoid=True)

def loss_fn(model, data):
    mean, logvar = model.encode(data)
    z2 = model.reparm(mean, logvar)
    out = model.decode(z2)
    # reduction='none' keeps the per-element loss so it can be summed per
    # sample below (the deprecated size_average/reduce arguments are removed)
    criterion = torch.nn.BCELoss(reduction='none')
    # criterion = torch.nn.BCEWithLogitsLoss(reduction='none')
    BCE = criterion(out, data)
    logpx_z = -torch.sum(BCE, [1, 2], keepdim=False)
    # logpx_z = -torch.sum(BCE, 2, keepdim=True)
    logpz = log_normal_pdf(z2, torch.tensor(0.), torch.tensor(0.))
    logqz_x = log_normal_pdf(z2, mean, logvar)
    mean = logpx_z + logpz - logqz_x
    loss = -torch.mean(mean)
    return logvar, mean, loss, out, logqz_x, logpz, logpx_z, z2
```
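For comparison, here is a minimal standalone sketch (with made-up tensor shapes, not my actual model) of how `BCEWithLogitsLoss` is normally applied to the *raw* decoder outputs, with no sigmoid in `decode`. Because it fuses the sigmoid and the BCE with a log-sum-exp trick, it stays finite even where `sigmoid(logits)` would round to exactly 0 or 1 and plain `BCELoss` would hit `log(0)`:

```python
import torch

# Illustrative tensors only: raw decoder outputs (logits) and binary
# targets in [0, 1]; shapes are (batch, channels, length) for a 1D model.
logits = torch.randn(8, 1, 100, requires_grad=True)
targets = torch.rand(8, 1, 100)

# Fused sigmoid + BCE; reduction='none' returns the per-element loss
# so it can be summed per sample, mirroring the loss_fn above.
criterion = torch.nn.BCEWithLogitsLoss(reduction='none')
per_element = criterion(logits, targets)       # same shape as logits
logpx_z = -torch.sum(per_element, dim=[1, 2])  # one value per sample

# Even an extremely saturated logit stays finite here:
saturated = criterion(torch.tensor([40.0]), torch.tensor([0.0]))
```

Note the loss is computed on `logits` directly; if `decode` already applies a sigmoid internally, feeding its output to `BCEWithLogitsLoss` would apply the sigmoid twice.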
```python
data = data.cuda()
optimizer.zero_grad()
logvar, mean, loss, out, logqz_x, logpz, logpx_z, z2 = loss_fn(model, data)
loss.backward()
optimizer.step()
```
As a quick aside, the loss starts to drop but never falls below 300, and it always has a grad_fn=
I have wondered whether the problem is in `loss.backward()` and `optimizer.step()`…
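In case it helps anyone diagnose this, here is a hedged sketch (toy tensors, not my model) of the two checks I understand are commonly used to localize a NaN: `torch.autograd.set_detect_anomaly(True)`, which makes `backward()` raise at the operation that first produced a NaN, and an explicit `torch.isnan` guard on the loss before stepping:

```python
import torch

# Anomaly mode records forward-pass tracebacks so that backward() can
# point at the exact op that produced a NaN (it slows training, so it is
# usually enabled only while debugging).
torch.autograd.set_detect_anomaly(True)

# Toy stand-ins for model parameters and a batch.
w = torch.randn(3, requires_grad=True)
x = torch.randn(3)
loss = (w * x).sum()

# Guard before backward/step: catches the exact epoch where loss breaks.
if torch.isnan(loss):
    raise RuntimeError("loss went NaN")

loss.backward()
```

Running the real training loop under anomaly mode around epoch 38–39 should show whether the NaN originates in the loss itself or further back in the graph.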
I hope someone can spot the error…
Many thanks & stay safe everyone