KL Divergence produces negative values

They still aren’t distributions. :slight_smile: KL divergence is only guaranteed to be non-negative when both arguments are valid probability distributions (the input given as log-probabilities, the target as probabilities, each summing to 1).
Also keep in mind that the loss functions take batches, so you’d want to unsqueeze(0).
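A minimal sketch of what this looks like in practice, assuming `F.kl_div` with illustrative logits (the tensor values are made up): normalize with `log_softmax`/`softmax` so both arguments are genuine distributions, and add the batch dimension with `unsqueeze(0)`.

```python
import torch
import torch.nn.functional as F

# Hypothetical unnormalized scores -- not distributions yet.
logits_p = torch.tensor([0.2, 1.5, -0.3])
logits_q = torch.tensor([0.1, 0.9, 0.4])

# Normalize: F.kl_div expects the input as log-probabilities
# and the target as probabilities.
log_p = F.log_softmax(logits_p, dim=-1)
q = F.softmax(logits_q, dim=-1)

# Add the batch dimension the loss function expects.
kl = F.kl_div(log_p.unsqueeze(0), q.unsqueeze(0), reduction="batchmean")
# With properly normalized distributions, kl is >= 0.
```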

Best regards

Thomas