# Use KL divergence as loss between two multivariate Gaussians

Hi,

I want to use the KL divergence as a loss function between two multivariate Gaussians. Is the following the right way to do it?

```python
import torch

B, D = 8, 4
mu1 = torch.randn((B, D), requires_grad=True)  # mu1/std1 were undefined above; defined here so the snippet runs
std1 = torch.rand((B, D)) + 0.1                # std must be strictly positive
p = torch.distributions.Normal(mu1, std1)

mu2 = torch.rand((B, D))
std2 = torch.rand((B, D)) + 0.1
q = torch.distributions.Normal(mu2, std2)

loss = torch.distributions.kl_divergence(p, q).mean()
loss.backward()
```
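One caveat (my own note, not from the thread): `Normal` with `(B, D)` parameters is a *batch* of `B * D` univariate Gaussians, so `kl_divergence` returns a `(B, D)` tensor of elementwise KLs. If you want one KL value per `D`-dimensional diagonal Gaussian, you can wrap each `Normal` in `torch.distributions.Independent` — a minimal sketch:

```python
import torch

B, D = 8, 4
mu1, mu2 = torch.randn(B, D), torch.randn(B, D)
std1, std2 = torch.rand(B, D) + 0.1, torch.rand(B, D) + 0.1  # keep std > 0

# Independent(..., 1) reinterprets the last dim as the event dim:
# each row becomes one D-dimensional diagonal Gaussian
p = torch.distributions.Independent(torch.distributions.Normal(mu1, std1), 1)
q = torch.distributions.Independent(torch.distributions.Normal(mu2, std2), 1)

kl = torch.distributions.kl_divergence(p, q)  # shape (B,): one KL per distribution
loss = kl.mean()
```

This sums the elementwise KLs over the event dimension, which is the correct KL for independent coordinates.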

My understanding is that `torch.distributions.kl_divergence` computes KL(p, q), following the derivation in section 9 of this document.
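As a quick sanity check (my own, not from the thread), you can compare `kl_divergence` against the closed-form KL between two univariate Gaussians:

```python
import torch

mu1, std1 = torch.tensor([0.5]), torch.tensor([1.2])
mu2, std2 = torch.tensor([-0.3]), torch.tensor([0.8])

p = torch.distributions.Normal(mu1, std1)
q = torch.distributions.Normal(mu2, std2)

kl_torch = torch.distributions.kl_divergence(p, q)

# Closed form: KL(p || q) = log(s2/s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 1/2
kl_manual = (torch.log(std2 / std1)
             + (std1 ** 2 + (mu1 - mu2) ** 2) / (2 * std2 ** 2)
             - 0.5)

print(torch.allclose(kl_torch, kl_manual))  # True
```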

Any update on this question?

Hi,

You are right. When you are using distributions from the `torch.distributions` package, you are doing fine with `torch.distributions.kl_divergence`. But if you want to compute the KL from two tensors obtained elsewhere, you can use the following approach:

`torch.nn.functional.kl_div` computes the KL between the outputs of two arbitrary layers.
Just be aware that the input `a` must contain log-probabilities and the target `b` must contain probabilities.