I started working on different topics shortly after sending my last message, so I haven't made much progress yet.

Now, however, I am back on a project which involves generative models and inference, so I expect I'll have more time to work on this.

I created scilua before, so I can capitalize on that and start with statistical distributions, followed by HMC (basic and NUTS) and then variational methods.

Before proceeding, there are a few preliminary points I'd like to discuss:

**What is the plan for stochastic graphs?**

For unbiased gradients, pathwise-type estimators come for free.

Other than that, my understanding is that the currently supported approach is via `reinforce()`.

I also found `stochastic.py`, where the score-type estimators are implemented for some distributions.

However, both of these are "local" approaches, and it doesn't seem to me that the current framework would allow for the automatic implementation of unbiased gradient estimators for more complex cases (say, example 2 in section 2.3 of the stochastic computation graphs paper) without modifications.
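To make the distinction between the two estimator families concrete, here is a minimal Monte Carlo sketch (plain Python, not the PyTorch API) comparing them on d/dμ E[f(x)] for x ~ Normal(μ, σ²) with f(x) = x², whose true value is 2μ:

```python
import random

def pathwise_grad(mu, sigma, n, rng):
    # Pathwise/reparameterization estimator: write x = mu + sigma * eps and
    # differentiate through the sample, so the estimator is f'(x) = 2 * x here.
    return sum(2 * (mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(n)) / n

def reinforce_grad(mu, sigma, n, rng):
    # Score-function (REINFORCE) estimator: f(x) * d/dmu log p(x | mu, sigma),
    # where d/dmu log p = (x - mu) / sigma**2. No derivative of f is needed,
    # which is why it applies more broadly, at the cost of higher variance.
    total = 0.0
    for _ in range(n):
        x = mu + sigma * rng.gauss(0.0, 1.0)
        total += (x ** 2) * (x - mu) / sigma ** 2
    return total / n
```

With μ = 1 and σ = 1, both estimates converge to 2, but the score-function estimate fluctuates noticeably more for the same sample budget.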

**Classes for statistical distributions?**

Assuming that there is a plan to fully support stochastic graphs in the future, it would make sense to implement distributions as classes instead of separate methods (`Normal().log_pdf()` vs `normal_log_pdf()`). Parameters would be passed to the constructor instead of to every member function call.
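A minimal sketch of the class-based API suggested above (the names and signatures are illustrative, not an existing PyTorch interface): parameters live in the constructor, so member calls only take data.

```python
import math
import random

class Normal:
    def __init__(self, mean, std):
        # Distribution parameters are bound once, at construction time.
        self.mean = mean
        self.std = std

    def log_pdf(self, x):
        # log N(x | mean, std^2)
        z = (x - self.mean) / self.std
        return -0.5 * z * z - math.log(self.std) - 0.5 * math.log(2 * math.pi)

    def sample(self, rng):
        return rng.gauss(self.mean, self.std)

# Member calls no longer repeat the parameters:
ll = Normal(0.0, 1.0).log_pdf(0.0)
```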

**Separate gradient estimators?**

I would keep the logic related to gradient computations separate, via an option passed to the constructor or a wrapping class (`Normal(gradient_estimator=PathwiseEstimator())` or `PathwiseEstimator(Normal())`), to retain flexibility, as there are many different ways to produce such estimators.
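A hypothetical sketch of the wrapping-class variant (all names here are illustrative assumptions, not a real API): the estimator class decides *how* a sample is produced, and hence how gradients would flow through it, while the distribution only knows its own parameters and density.

```python
import random

class Normal:
    def __init__(self, mean, std):
        self.mean, self.std = mean, std

class PathwiseEstimator:
    """Sample via reparameterization: x = mean + std * eps.
    In an autograd setting, gradients w.r.t. mean/std flow through the sample."""
    def __init__(self, dist):
        self.dist = dist

    def sample(self, rng):
        eps = rng.gauss(0.0, 1.0)
        return self.dist.mean + self.dist.std * eps

class ScoreEstimator:
    """Sample directly; gradients would instead come from a REINFORCE-style
    surrogate term built from the distribution's log-density."""
    def __init__(self, dist):
        self.dist = dist

    def sample(self, rng):
        return rng.gauss(self.dist.mean, self.dist.std)

# Both wrappers expose the same sampling interface:
x = PathwiseEstimator(Normal(0.0, 1.0)).sample(random.Random(0))
```

Swapping estimators then changes only the wrapper, not any code that consumes the samples.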

This can introduce some issues as tensors are not currently promoted to autograd variables but I assume this will be done in the future.

A `cuda` named argument would be passed to the constructor of the statistical classes to specify that operations like random number generation are done on the GPU.

**Assumptions on data shapes?**

It might be beneficial to assume `[batch, random_variable.size()]` dimensions: the first dimension contains i.i.d. samples, meaning it gets averaged over in log-likelihood computations, and it leaves room to generate multiple samples.
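A small sketch of that `[batch, event]` convention (plain Python lists stand in for tensors, and scalars stand in for the event dimension; names are illustrative): `log_pdf` accepts a batch of samples and the log-likelihood averages over the leading i.i.d. dimension.

```python
import math

def normal_log_pdf(x, mean, std):
    # log N(x | mean, std^2) for a single event.
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)

def batched_log_likelihood(batch, mean, std):
    # `batch` indexes the leading i.i.d. dimension; the result is the
    # average log-likelihood over that dimension.
    return sum(normal_log_pdf(x, mean, std) for x in batch) / len(batch)
```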

Basically, I'm looking for the PyTorch core team's comments / design suggestions on the points I mentioned above, and on any I failed to consider!