Good morning,
I hope everyone is staying safe and isolating as much as possible/required…
I have a problem with a 1D CVAE I am creating. No matter what I do, after 38 or 39 epochs I always get a NaN value in the loss function when using BCEWithLogitsLoss.
I have a feeling that it has to do with the loss function calculation, or that I am doing something wrong in setting the model up, but I really cannot figure it out.
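For reference, here is a minimal scalar sketch of why the way the binary cross-entropy is computed can matter numerically. It is plain Python rather than PyTorch, the helper names `bce_naive` and `bce_from_logit` are made up for illustration, and it is not meant as a diagnosis of the NaN above. It only shows that applying a sigmoid and then taking logs breaks down for large logits, while the usual rewrite that works directly on the logit stays finite.

```python
import math

def bce_naive(logit, target):
    """BCE computed as sigmoid followed by log; fragile for large |logit|."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def bce_from_logit(logit, target):
    """Algebraically equivalent form that never takes log(0)."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

# Both forms agree for moderate logits.
print(bce_naive(0.5, 1.0), bce_from_logit(0.5, 1.0))

# For a large logit the sigmoid rounds to exactly 1.0, so log(1 - p) fails.
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive form failed:", err)

# The logit-based form stays finite for the same input.
print(bce_from_logit(40.0, 0.0))
```

In plain Python the naive form raises an exception. With floating-point tensors the same situation would typically show up as `inf` or `nan` values instead.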
Loss Function
def sample(self, eps=None):
    if eps is None:
        eps = torch.randn(1, self.lat_dim)
    print("eps=", eps)
    return self.decode(eps, apply_sigmoid=True)
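def loss_fn(model, data):
    mean, logvar = model.encode(data)
    z2 = model.reparm(mean, logvar)
    out = model.decode(z2)
    # reduction='none' keeps per-element losses so they can be summed per sample below.
    # The deprecated size_average/reduce arguments are dropped (reduce=False already forced 'none').
    # BCELoss expects probabilities in [0, 1]; if decode() returns logits, use BCEWithLogitsLoss.
    criterion = torch.nn.BCELoss(reduction='none')
    #criterion = torch.nn.BCEWithLogitsLoss(reduction='none')
    BCE = criterion(out, data)
    logpx_z = -torch.sum(BCE, [1, 2], keepdim=False)
    #logpx_z=-torch.sum(BCE,2,keepdim=True)
    logpz = log_normal_pdf(z2, torch.tensor(0.), torch.tensor(0.))
    logqz_x = log_normal_pdf(z2, mean, logvar)
    # Note: from here on `mean` holds the per-sample ELBO, not the encoder mean.
    mean = logpx_z + logpz - logqz_x
    loss = -torch.mean(mean)
    return logvar, mean, loss, out, logqz_x, logpz, logpx_z, z2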
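The `log_normal_pdf` helper is not shown above. As an assumption about what it computes, here is a scalar, plain-Python sketch of the log-density of a Gaussian parameterised by its log-variance. The clamping bounds `logvar_min` and `logvar_max` are hypothetical values added purely for illustration; they show how a very negative log-variance can be kept from overflowing `exp(-logvar)`.

```python
import math

def log_normal_pdf(x, mean, logvar, logvar_min=-20.0, logvar_max=20.0):
    """Log-density of N(mean, exp(logvar)) at a scalar x.

    The log-variance is clamped to [logvar_min, logvar_max] (illustrative
    bounds) so that exp(-logvar) cannot overflow.
    """
    logvar = max(logvar_min, min(logvar_max, logvar))
    return -0.5 * ((x - mean) ** 2 * math.exp(-logvar)
                   + logvar + math.log(2.0 * math.pi))

# Standard normal evaluated at 0, i.e. -0.5 * log(2 * pi).
print(log_normal_pdf(0.0, 0.0, 0.0))

# An extreme log-variance still gives a finite (if very negative) value.
print(log_normal_pdf(1.0, 0.0, -1000.0))
```

A tensor version would presumably do the same thing element-wise, sum over the latent dimension, and could use `torch.clamp` on `logvar` in the same way.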
Model Activation
data = data.cuda()
optimizer.zero_grad()
logvar,mean,loss,out,logqz_x,logpz,logpx_z,z2 = loss_fn(model, data)
loss.backward()
optimizer.step()
As a quick aside, the loss starts to drop towards zero but never goes below 300, and it always has a grad_fn=
I have wondered if the problem is in loss.backward() and optimizer.step()…
I hope someone can spot the error…
Many thanks & stay safe everyone
chaslie