KL divergence between two multivariate Gaussians

I have two multivariate Gaussian distributions and I would like to calculate the KL divergence between them. Each is defined by a vector of means and a vector of variances (similar to a VAE's mu and sigma layers). What is the best way to calculate the KL between the two? Is this even doable, given that I do not have the covariance matrix? If it is not, what if I take samples from both distributions and calculate the KL between the samples? Is that the right way of doing this? Is there any built-in function for any of these calculations?

Hi,

Yes, this is the correct approach.
Just be aware that the input a must contain log-probabilities and the target b should contain probabilities.

https://pytorch.org/docs/stable/nn.functional.html?highlight=kl_div#kl-div
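For example, a minimal sketch of that convention with made-up tensors (the softmax inputs are just placeholders):

```python
import torch
import torch.nn.functional as F

# Hypothetical example: two discrete distributions over 5 outcomes.
p = torch.softmax(torch.randn(5), dim=0)  # target: probabilities
q = torch.softmax(torch.randn(5), dim=0)

# F.kl_div computes KL(target || input): the input must be
# log-probabilities, the target probabilities (by default).
# reduction="sum" gives the full divergence instead of a mean
# over elements.
kl = F.kl_div(q.log(), p, reduction="sum")
print(kl)
```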

By the way, PyTorch uses this approach:


https://pytorch.org/docs/stable/distributions.html?highlight=kl_div#torch.distributions.kl.kl_divergence

Good luck
Nik

Thanks Nick for your input. I should restate my question: I have two multivariate distributions, each defined by "n" mus and sigmas. Now I would like to compute a KL divergence between these two. Any idea how that can be done?

You can sample x1 and x2 from N1(x|μ1, Σ1) and N2(x|μ2, Σ2) respectively, then compute the KL divergence using torch.nn.functional.kl_div(x1, x2).

As @Nikronic mentioned, kl_div requires the input a to be log-probabilities and the target b to be probabilities. Correct me if I'm wrong, but I think sampling from the two distributions is not going to give log-probabilities and probabilities.

Any suggestion? I don't think what you suggested can be used on the sampling layers.

@Rojin
Actually, I have never used the distributions class, but based on the docs, in this situation you can pass the distributions directly to kl_divergence from torch.distributions, provided they are instances of Distribution subclasses or you have registered your own.

Please refer to these links for more information:

  1. https://pytorch.org/docs/stable/distributions.html#torch.distributions.kl.kl_divergence
  2. https://pytorch.org/docs/stable/distributions.html#torch.distributions.distribution.Distribution
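For instance, a minimal sketch (with made-up mu/std values) treating each diagonal Gaussian as a batch of independent univariate Normals:

```python
import torch
from torch.distributions import Normal
from torch.distributions.kl import kl_divergence

# Made-up parameters for two 3-dimensional diagonal Gaussians.
mu1, std1 = torch.zeros(3), torch.ones(3)
mu2, std2 = torch.ones(3), 2 * torch.ones(3)

p = Normal(mu1, std1)  # Normal takes the standard deviation
q = Normal(mu2, std2)

# kl_divergence returns the per-dimension KL; since the
# dimensions are independent, the total KL is their sum.
kl = kl_divergence(p, q).sum()
print(kl)
```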

Well, you know, my name is Nik, an abbreviation of Nikan (a Persian name) :wink:

Bests


Hey Nikan!

:smiley: Yeah I realized that from your last name!

Sorry if I'm restating my question (I modified it a couple of times at the top, lol) and probably repeating your answer. I'm kinda new to the KL implementation in PyTorch, so I really appreciate your help.

My question is:
I have two VAEs, each with mu and standard deviation layers. Now I would like to calculate the KL divergence between these two VAEs using either the mus and sds or the sampling layers. This sounds to me like a multivariate Gaussian KL divergence problem, so I looked at the formula and noticed that I actually need the covariance matrix of q (if we assume KL(p||q)). And as far as I know there is no way to calculate the covariance in this case, am I correct?

@zhl515 suggested directly calculating the KL between the two sampling layers, as you said before, using the function below. But as you said, this requires probabilities, which I do not have, so I think the following function cannot be used on the sampling layers, is that right?

 torch.nn.functional.kl_div(p, q)

And based on your last comment, you are suggesting registering the distributions and then using

torch.distributions.kl.kl_divergence(p, q)

The only problem is that in order to register the distribution I need to have the covariance matrix, and I can't obtain that because I only have mu and std. It makes me wonder whether this is possible to do in a neural network at all. In the VAE paper they don't have this problem because they are assuming that q has mean 0 and identity covariance.

Thank you!

You said you can't obtain the covariance matrix. In the VAE paper, the authors assume the true (but intractable) posterior takes on an approximate Gaussian form with approximately diagonal covariance. So just place the variances (the squared standard deviations) on the diagonal of the covariance matrix, with all other elements set to zero.
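Concretely, a sketch under that assumption (the mu/std tensors here are placeholders for your VAE layers): build the diagonal covariance from the stds and let torch.distributions do the rest:

```python
import torch
from torch.distributions import MultivariateNormal
from torch.distributions.kl import kl_divergence

# Placeholder mu/std vectors standing in for the two VAEs' layers.
mu1, std1 = torch.randn(4), torch.rand(4) + 0.1
mu2, std2 = torch.randn(4), torch.rand(4) + 0.1

# The diagonal of a covariance matrix holds variances, i.e. std**2.
p = MultivariateNormal(mu1, covariance_matrix=torch.diag(std1 ** 2))
q = MultivariateNormal(mu2, covariance_matrix=torch.diag(std2 ** 2))

kl = kl_divergence(p, q)  # KL(p || q), a scalar tensor
print(kl)
```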


Now you can compute the KL divergence of two multivariate Gaussians directly from the formula below:

$$
D_{KL}\big(\mathcal{N}(\mu_1,\Sigma_1)\,\|\,\mathcal{N}(\mu_2,\Sigma_2)\big)
= \frac{1}{2}\left[\log\frac{\det\Sigma_2}{\det\Sigma_1} - n
+ \operatorname{tr}\!\left(\Sigma_2^{-1}\Sigma_1\right)
+ (\mu_2-\mu_1)^{\top}\Sigma_2^{-1}(\mu_2-\mu_1)\right]
$$
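For diagonal covariances the formula reduces to a sum over dimensions; here is a sketch of that specialization (the function name and parameters are my own):

```python
import torch

def kl_diag_gaussians(mu1, std1, mu2, std2):
    # KL( N(mu1, diag(std1^2)) || N(mu2, diag(std2^2)) ),
    # i.e. the closed-form expression above with diagonal Sigmas.
    var1, var2 = std1 ** 2, std2 ** 2
    return 0.5 * torch.sum(
        torch.log(var2 / var1)              # log det(Sigma2)/det(Sigma1)
        + (var1 + (mu1 - mu2) ** 2) / var2  # trace + Mahalanobis terms
        - 1.0                               # contributes the -n
    )
```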

Thank you, that solves the problem :slight_smile: