Use KL divergence as loss between two multivariate Gaussians

Hi,

I want to use the KL divergence as a loss function between two multivariate Gaussians. Is the following the right way to do it?

import torch

B, D = 8, 16  # batch size and Gaussian dimensionality (arbitrary values)

# Distribution p with learnable parameters (scales must be strictly positive)
mu1 = torch.rand((B, D), requires_grad=True)
std1 = torch.rand((B, D), requires_grad=True)
p = torch.distributions.Normal(mu1, std1)

# Target distribution q (no gradients needed)
mu2 = torch.rand((B, D))
std2 = torch.rand((B, D))
q = torch.distributions.Normal(mu2, std2)

# kl_divergence returns a (B, D) tensor of elementwise KL terms
loss = torch.distributions.kl_divergence(p, q).mean()
loss.backward()
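
One caveat worth noting: Normal with (B, D)-shaped parameters is a batch of B*D independent univariate Gaussians, so kl_divergence returns per-dimension KL terms rather than one KL per D-dimensional Gaussian. If you want each row treated as a single diagonal multivariate Gaussian, a minimal sketch (my reading of the docs, using torch.distributions.Independent to fold the last dimension into the event shape):

import torch

B, D = 8, 16  # arbitrary batch size and dimensionality

mu1 = torch.rand((B, D), requires_grad=True)
std1 = torch.rand((B, D), requires_grad=True)
# reinterpreted_batch_ndims=1: treat the last dimension as the event dimension
p = torch.distributions.Independent(torch.distributions.Normal(mu1, std1), 1)

mu2 = torch.rand((B, D))
std2 = torch.rand((B, D))
q = torch.distributions.Independent(torch.distributions.Normal(mu2, std2), 1)

# kl_divergence now has shape (B,): the per-dimension KLs are summed over D
loss = torch.distributions.kl_divergence(p, q).mean()
loss.backward()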

My understanding is that torch.distributions.kl_divergence computes KL(p, q) following the derivations in Section 9 of this document.

Any update on this question?

Hi,

You are right. When you are using distributions from the torch.distributions package, using torch.distributions.kl_divergence is the right way to do it. But if you want to compute the KL divergence between two tensors obtained elsewhere, you can use the following approach:

@Rojin I have actually posted this on your thread.

This computes the KL divergence between the outputs of two arbitrary layers.
Just be aware that the input a should contain log-probabilities and the target b should contain probabilities.

https://pytorch.org/docs/stable/nn.functional.html?highlight=kl_div#kl-div
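
Since the code itself was posted on the other thread, here is a minimal sketch of that approach; the tensor names and shapes are illustrative, assuming a and b are raw logits from two layers:

import torch
import torch.nn.functional as F

a = torch.randn(8, 10, requires_grad=True)  # output of one layer (hypothetical)
b = torch.randn(8, 10)                      # output of another layer (hypothetical)

log_p = F.log_softmax(a, dim=-1)  # input must be log-probabilities
q = F.softmax(b, dim=-1)          # target must be probabilities

# F.kl_div(input, target) computes KL(target || distribution given by input);
# reduction='batchmean' divides the summed KL by the batch size
loss = F.kl_div(log_p, q, reduction='batchmean')
loss.backward()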

By the way, PyTorch uses this approach:

https://pytorch.org/docs/stable/distributions.html?highlight=kl_div#torch.distributions.kl.kl_divergence
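
For two univariate Gaussians that KL has a closed form, KL(p || q) = log(s2/s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 1/2. A quick sanity check (my own sketch, not from the thread) that this matches the library:

import torch

p = torch.distributions.Normal(torch.tensor(0.0), torch.tensor(1.0))
q = torch.distributions.Normal(torch.tensor(1.0), torch.tensor(2.0))

# Closed-form KL between univariate Gaussians, written with loc/scale attributes
analytic = (torch.log(q.scale / p.scale)
            + (p.scale**2 + (p.loc - q.loc)**2) / (2 * q.scale**2)
            - 0.5)
print(torch.allclose(analytic, torch.distributions.kl_divergence(p, q)))  # True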

Good luck
Nik