torch.distributions.MultivariateNormal.log_prob throws RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED for large batch sizes

A similar problem was reported here. The fix there was to repeat the covariance matrices explicitly, one copy per batch element, rather than passing a single matrix.
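For reference, here is a minimal sketch of what that workaround might look like. It assumes a single shared mean of shape (D,) and a single covariance of shape (D, D) evaluated on a batch of samples of shape (B, D). The helper name `log_prob_with_repeated_cov` is made up for illustration and does not come from the linked report, and I have not confirmed that this avoids the cuBLAS error on every PyTorch/CUDA version.

```python
import torch
from torch.distributions import MultivariateNormal


def log_prob_with_repeated_cov(x, mean, cov):
    """Evaluate log_prob with one explicit covariance copy per sample.

    x:    (B, D) batch of samples
    mean: (D,)   shared mean (assumed shape)
    cov:  (D, D) shared covariance (assumed shape)
    """
    batch = x.shape[0]
    # Materialize B copies of the covariance instead of relying on broadcasting.
    cov_rep = cov.unsqueeze(0).repeat(batch, 1, 1)   # (B, D, D)
    mean_rep = mean.unsqueeze(0).expand(batch, -1)   # (B, D)
    dist = MultivariateNormal(mean_rep, covariance_matrix=cov_rep)
    return dist.log_prob(x)                          # (B,)


# Small demo; uses the GPU only if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(16, 3, device=device)
mean = torch.zeros(3, device=device)
cov = 2.0 * torch.eye(3, device=device)
print(log_prob_with_repeated_cov(x, mean, cov).shape)
```

Note that repeating the matrices costs B * D * D elements of memory and a Cholesky factorization per copy, so for very large batches it may still be necessary to process the samples in chunks.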