I have this simple network and I want to try a very simple idea. That is, let’s suppose that the input to the network is a pair of tensors of size `(batch_size, num_channels, dim, dim)` that represent the mean values and variances of `N = batch_size * num_channels * dim * dim` univariate normal distributions. In the forward function, I’m interested in producing a new tensor `z` of the same size (i.e., `(batch_size, num_channels, dim, dim)`), where each element is drawn from a univariate normal distribution whose mean and variance are given by the corresponding elements of the two input tensors. After that, I want to use the variable `z` just like I would use the standard input variable `x` (add some convolution layers, etc.).
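For reference, `torch.distributions` can express this element-wise sampling directly; a minimal sketch (the shapes and values below are just placeholders, and `Normal` takes a standard deviation, hence the `sqrt()`):

```python
import torch
from torch.distributions import Normal

x_mean = torch.zeros(1, 16, 4, 4, requires_grad=True)
x_var = torch.full((1, 16, 4, 4), 0.5, requires_grad=True)

dist = Normal(x_mean, x_var.sqrt())  # Normal expects std, not variance

z = dist.rsample()  # reparameterized draw: stays connected to the graph
s = dist.sample()   # plain draw: detached from the graph

print(z.grad_fn is not None)  # True
print(s.grad_fn)              # None
```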
I’m doing this as shown below. Even though it seems to work as expected, I’m not sure what will happen during back-propagation. That is, at the point where I create the variable `z` (should a `.requires_grad_(True)` be added there?), it seems that the connections with the inputs `x_mean` and `x_var` might be broken. After all, sampling from a distribution is not a differentiable operation…

I wonder if this is going to work during back-propagation, and, if not, whether you have any ideas/insights on how to implement it so that it does.
```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Conv2d(in_channels=16, out_channels=32,
                              kernel_size=3, padding=1)

    def forward(self, x_mean, x_var):
        # Draw z element-wise: z = mean + sqrt(var) * eps, with eps ~ N(0, 1).
        # randn_like keeps the noise on the same device/dtype as the inputs.
        z = x_var.sqrt() * torch.randn_like(x_mean) + x_mean
        return self.conv(z)


net = Net()
x_mean = torch.randn(1, 16, 300, 300).requires_grad_(True)
# Variances must be non-negative, otherwise sqrt() produces NaNs;
# rand() samples from [0, 1) instead of a standard normal.
x_var = torch.rand(1, 16, 300, 300).requires_grad_(True)
z = net(x_mean, x_var)
```
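For what it’s worth, one way to check this empirically is to backpropagate a dummy scalar loss and see whether gradients actually arrive at the inputs; a minimal sketch, continuing from the snippet above (the `.sum()` loss is just a placeholder):

```python
# Continuing from the snippet above: backpropagate a dummy scalar loss.
loss = z.sum()
loss.backward()

# If the sampling step kept the graph intact, both gradients exist.
print(x_mean.grad is not None)  # gradient w.r.t. the means
print(x_var.grad is not None)   # gradient w.r.t. the variances
```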