Proper Sampling Strategy

Hello good people! I am a bit stuck with the following problem. If anyone can provide some feedback, I’d really appreciate it!

Let’s assume that we are implementing a probabilistic model (e.g. the encoder of a VAE) that takes an image and predicts a stochastic latent vector. To be concrete, let us say that the batch size is 16 and the latent dimension is 32. Hence, the final layer of this network has 64 units (32 for mu, 32 for log_var), and for one input batch the output shape is 16x64. Chunking this output into two parts, we get mu (16x32) and log_var (16x32).

We want to draw samples from these stochastic variables in a way that lets gradients flow back to mu and log_var.

Current Solution I:

sample = mu + torch.exp(log_var)*torch.randn(16, 32)

Current Solution II:

from torch.distributions import Normal

dist = Normal(loc=torch.zeros(32), scale=torch.eye(32))
sample = mu + torch.exp(log_var)*dist.sample(sample_shape=torch.Size([16, 32]))
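For comparison, here is a minimal sketch of the two approaches with shapes that line up. It assumes log_var is the log of the variance, in which case the standard deviation is exp(0.5 * log_var) rather than exp(log_var); if the network is meant to predict the log of the standard deviation instead, exp(log_var) would be the right scale. It also uses a scalar standard normal, because Normal(loc=torch.zeros(32), scale=torch.eye(32)) broadcasts to a 32x32 batch of distributions whose off-diagonal scales are zero, and sampling it with sample_shape [16, 32] would give a 16x32x32x32 tensor. The mu and log_var tensors below are random stand-ins for the encoder output.

```python
import torch
from torch.distributions import Normal

batch_size, latent_dim = 16, 32

# Stand-ins for the two halves of the encoder output (assumption: random values).
mu = torch.randn(batch_size, latent_dim, requires_grad=True)
log_var = torch.randn(batch_size, latent_dim, requires_grad=True)

# Assumption: log_var = log(sigma^2), hence sigma = exp(0.5 * log_var).
std = torch.exp(0.5 * log_var)

# Approach I: standard normal noise from torch.randn.
eps_1 = torch.randn(batch_size, latent_dim)
sample_1 = mu + std * eps_1

# Approach II: the same kind of noise from a scalar standard normal,
# drawn with sample_shape equal to the desired output shape.
standard_normal = Normal(loc=0.0, scale=1.0)
eps_2 = standard_normal.sample(sample_shape=torch.Size([batch_size, latent_dim]))
sample_2 = mu + std * eps_2

# Both samples are differentiable functions of mu and log_var.
(sample_1.sum() + sample_2.sum()).backward()
```

Written this way, the two approaches differ only in where the noise comes from, so they draw from the same distribution. Individual values still differ because each call consumes fresh random numbers.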

Alternative Solution … ?

dist = torch.distributions.Normal(loc=mu, scale=torch.exp(log_var))
sample = dist.rsample()
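A minimal sketch of this alternative, assuming log_var is the log of the variance (so the scale is exp(0.5 * log_var)) and using random stand-ins for mu and log_var. rsample() draws through the reparameterization trick, so gradients reach mu and log_var, whereas sample() returns a tensor detached from the graph. Wrapping the distribution in Independent does not change the samples; it treats the last dimension as a single event, so log_prob returns one value per batch element instead of one per latent dimension.

```python
import torch
from torch.distributions import Independent, Normal

batch_size, latent_dim = 16, 32

# Stand-ins for the encoder output (assumption: random values).
mu = torch.randn(batch_size, latent_dim, requires_grad=True)
log_var = torch.randn(batch_size, latent_dim, requires_grad=True)
std = torch.exp(0.5 * log_var)  # assumption: log_var = log(sigma^2)

# Element-wise normal: batch_shape (16, 32), empty event_shape.
dist = Normal(loc=mu, scale=std)
sample = dist.rsample()                    # shape (16, 32), differentiable
per_dim_log_prob = dist.log_prob(sample)   # shape (16, 32)
detached = dist.sample()                   # shape (16, 32), no gradient

# Diagonal Gaussian over the latent vector: batch_shape (16,), event_shape (32,).
diag_dist = Independent(dist, 1)
diag_sample = diag_dist.rsample()                  # shape (16, 32)
joint_log_prob = diag_dist.log_prob(diag_sample)   # shape (16,)

# Gradients flow from the reparameterized sample back to mu and log_var.
sample.sum().backward()
```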


  1. Are these current solutions correct? Are they identical?
  2. Can I get an alternative solution using rsample(), or maybe torch.distributions.Independent? That is, can we simply plug the predicted mu and sigma into one of the built-in PyTorch distributions and sample from that?