Reinforce weighted sampling

mordith · May 22, 2017, 1:57pm

If I understand correctly, when we draw samples with a sampling method (e.g. torch.multinomial) and then call the .reinforce() method on the result, the reward is backpropagated to whatever process created the samples.

My question is whether we can create a weighted sampler and use .reinforce() to update this sampler as well.

smth · May 28, 2017, 5:09pm

Yes, you can do that (I think).
See how the stochastic nodes are implemented:
https://github.com/pytorch/pytorch/blob/master/torch/autograd/_functions/stochastic.py
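
As a rough illustration of the idea, here is a framework-free sketch of a weighted sampler whose weights are updated from a reward with the score-function (REINFORCE) estimator. It is not the code behind .reinforce(); the names softmax, reinforce_step and train_sampler, the toy reward, and the learning rate are all assumptions made up for this example. The sampler is parameterized by logits, and the gradient of log p(action) with respect to logit j is 1[j == action] - p_j.

```python
import math
import random


def softmax(logits):
    """Turn unnormalized logits into sampling probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, lr=0.1, baseline=0.0):
    """One REINFORCE update of a categorical (weighted) sampler.

    Hypothetical helper: samples an index from softmax(logits), observes a
    reward, and moves the logits along reward * grad log p(sampled index).
    """
    probs = softmax(logits)
    # Weighted sampling: pick one index according to the current probabilities.
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    reward = reward_fn(action)
    advantage = reward - baseline
    # d/d logit_j of log p(action) = 1[j == action] - probs[j]
    new_logits = [
        logit + lr * advantage * ((1.0 if j == action else 0.0) - probs[j])
        for j, logit in enumerate(logits)
    ]
    return new_logits, action, reward


def train_sampler(num_items=3, good_item=2, steps=2000, seed=0):
    """Toy setup (assumed): only sampling `good_item` yields reward 1."""
    rng = random.Random(seed)
    logits = [0.0] * num_items

    def reward_fn(action):
        return 1.0 if action == good_item else 0.0

    for _ in range(steps):
        logits, _, _ = reinforce_step(logits, reward_fn, rng)
    return softmax(logits)


if __name__ == "__main__":
    print(train_sampler())
```

In this toy setup the sampler should gradually put most of its probability on the rewarded item. In an autograd framework the same update would come from backpropagating through the log-probability of the sampled index instead of writing the gradient by hand.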