Implementing a Multivariate Normal Distribution in LibTorch

Alio · August 11, 2022, 6:55pm

I’m trying to write a Multivariate Normal Distribution with rsample() in LibTorch because LibTorch has no distributions yet.
I’m not sure I’ve extracted what I need from the source code. I think my problem is that my Multivariate Normal Distribution doesn’t know what to do with a covariance matrix in place of a scalar standard deviation.
I’m also having issues recreating torch.distributions.MultivariateNormal in Torch.

The Python problems
This is my attempt to write a Multivariate Normal Distribution in Torch, in Python. I’m not getting errors, but the result is a matrix of size torch.Size([2, 2]). When I pass the exact same tensors to torch.distributions.MultivariateNormal and take a sample or an rsample, I get a torch.Size([2]) instead.

µ = torch.tensor([-0.5, 2.0])
cov = torch.tensor([[2.0, 8], [2.0, 40]])
pos_cov = cov @ cov.T 
Σ = torch.linalg.cholesky(pos_cov)

ε = torch.randn(1)

torch.det(torch.pi * 2 * Σ)**1/2 * torch.exp(torch.tensor(-1/2) * (ε - µ).T * Σ**-1 * (ε - µ))

Outputs a tensor like

tensor([[1182.5215,    0.0000],
        [1245.9080, 1102.0341]])

MultivariateNormal(µ, pos_cov).sample()
Outputs a tensor like
tensor([ 7.6214, 37.2902])

The LibTorch problems
I don’t see anything about a covariance matrix or Multivariate Normal Distribution in the CUDA DistributionTemplates.cu file.

I know my Python implementation of the Multivariate Normal Distribution doesn’t match the torch.distributions.MultivariateNormal function, but I tried implementing what I have in LibTorch.

#include <torch/torch.h>

class MultivariateNormalx{
    torch::Tensor mean, stddev, var;
public:
    MultivariateNormalx(const torch::Tensor &mean, const torch::Tensor &std) : mean(mean), stddev(std), var(std * std) {}

    torch::Tensor rsample() {
        auto device = torch::cuda::is_available() ? torch::kCUDA : torch::kCPU;
        auto eps = torch::randn(1).to(device);
        auto cholesky = torch::linalg::cholesky(torch::matmul(stddev.transpose(0, 1), stddev));
        auto pi = torch::tensor(3.141592653589793);
        return pow(torch::det(pi * 2 * cholesky), 1/2) * torch::exp(-1/2 * (eps - mean).transpose(0, 1) * pow(cholesky, -1) * (eps - mean));
    }
};

torch::Tensor μ = torch::tensor({{-0.5, 2.0}, {-0.5, 2.0}});
torch::Tensor cov = torch::tensor({{2.0, 8.0}, {2.0, 40.0}});

int main () {

    std::cout << MultivariateNormalx(μ, cov).rsample();

}

Always returns…

 1 -nan
 1  1
[ CPUFloatType{2,2} ]

GMXeon · February 8, 2023, 10:31pm

I know this post is somewhat old, but I had to implement this distribution and came across it. There are a few fixes that need to be made to OP’s function.

Fix one: Matrix multiplication is not commutative, unlike regular multiplication. The order matters, and the matrices were accidentally supplied backwards.
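
To make the ordering point concrete, here is a minimal PyTorch sketch. It assumes the two products in question are the ones in the snippets above: cov @ cov.T in the Python version and matmul(stddev.transpose(0, 1), stddev) in the LibTorch version. That reading is an assumption, since the fix does not name the matrices.

```python
import torch

cov = torch.tensor([[2.0, 8.0], [2.0, 40.0]])

# Order used in the Python snippet: cov times its transpose.
a = cov @ cov.T

# Order used in the LibTorch snippet: transpose first, then cov.
b = cov.T @ cov

print(a)  # tensor([[  68.,  324.], [ 324., 1604.]])
print(b)  # tensor([[   8.,   96.], [  96., 1664.]])

# Both results are symmetric, but they are different matrices.
order_matters = not torch.allclose(a, b)
print(order_matters)  # True
```

Both products are valid covariance matrices here, so neither order raises an error; they simply describe different distributions, which is why the mismatch is easy to miss.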

Fix two: OP was a tad careless with their divisions, or their brain was stuck in Python mode and forgot that C will gladly do integer division without floating-point promotion where it can. As a result, both divisions of constant integers (1/2 and -1/2) evaluate to zero, because integer division discards the remainder. That is consistent with the ones in the output above, since pow(x, 0) and exp(0) are both 1. Add a “.0” suffix to each constant to force floating-point division, or skip the compile-time division entirely and supply 0.5 and -0.5 respectively.
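
The Python attempt has related pitfalls of its own, although they come from operator precedence and element-wise semantics rather than integer division. The sketch below illustrates two of them with made-up values: x**1/2 parses as (x**1)/2, and ** -1 on a tensor takes the reciprocal of each entry instead of inverting the matrix. The matrix L here is a hypothetical lower-triangular example, not the one from the original post.

```python
import torch

x = 9.0
halved = x**1/2      # parsed as (x**1) / 2, giving 4.5
rooted = x**(1/2)    # the intended square root, giving 3.0

# Hypothetical lower-triangular matrix, similar in shape to a Cholesky factor.
L = torch.tensor([[2.0, 0.0], [1.0, 4.0]])

elementwise = L**-1            # reciprocal of each entry; the 0 entry becomes inf
inverse = torch.linalg.inv(L)  # the actual matrix inverse

print(halved, rooted)
print(elementwise)
print(inverse)
```

The inf produced by the element-wise reciprocal of a zero entry is one plausible source of the nan in the LibTorch output, since 0 multiplied by inf is nan.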

Suggestion one: Drop the ‘var’ property, it’s not used. Probably left in from debugging.

Suggestion two: Probably want to rename both pow(…) to at::pow(…) to disambiguate in case the std namespace is being used.

Suggestion three: Use M_PI (from cmath) for pi instead of defining your own inline constant, and the instantiation of the pi tensor should probably just be inlined since it’s only used once. (M_PI is not part of standard C++; on MSVC it requires defining _USE_MATH_DEFINES before including cmath.) It’s a shame that torch::pi doesn’t exist in LibTorch; it’s really odd that they omitted it.
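
Separately from the fixes above, the expression in the original post follows the shape of the density formula rather than a sampler, and its element-wise * between a length-2 vector and a 2×2 matrix broadcasts to a 2×2 result, which would explain the torch.Size([2, 2]) output. A reparameterized sample is usually drawn as µ + Lε, where L is the Cholesky factor of the covariance and ε is a vector of standard normals with one entry per dimension. The following is a minimal sketch of that idea in PyTorch, not a copy of the library's implementation; the helper name rsample and the fixed seed are choices made for illustration.

```python
import torch
from torch.distributions import MultivariateNormal

torch.manual_seed(0)  # fixed seed only so that runs are repeatable

mu = torch.tensor([-0.5, 2.0])
cov = torch.tensor([[2.0, 8.0], [2.0, 40.0]])
pos_cov = cov @ cov.T                # symmetric positive definite covariance
L = torch.linalg.cholesky(pos_cov)   # lower-triangular factor with L @ L.T == pos_cov


def rsample(mu, L):
    """Draw one sample as mu + L @ eps (illustrative helper, not a library API)."""
    eps = torch.randn(mu.shape[-1])  # one standard-normal draw per dimension
    return mu + L @ eps              # stays differentiable w.r.t. mu and L


x = rsample(mu, L)
ref = MultivariateNormal(mu, scale_tril=L).rsample()

print(x.shape)    # torch.Size([2])
print(ref.shape)  # torch.Size([2])
```

The same three steps use functions that already appear in the LibTorch snippet above (torch::linalg::cholesky, torch::randn, and torch::matmul), so porting this sketch should mostly be a matter of syntax.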