Does distributions.Categorical use REINFORCE for the gradient calculation?

mjas · November 7, 2018, 8:52am

I use the output of `torch.distributions.Categorical.sample()` internally in my computation graph, not as the final step. (My sampled action affects the update of a state variable.) What is the default gradient calculation that PyTorch applies in this case, please: REINFORCE, the pathwise derivative, or nothing?

albanD · November 7, 2018, 10:25am

Hi,

The default is nothing, as far as I know: the returned samples will have requires_grad=False, so no gradient flows back through them.
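
A quick way to check this is sketched below. It is only a minimal illustration: the 3-way distribution and the variable names are assumptions made for the example, not anything from the original question.

```python
import torch
from torch.distributions import Categorical

# Illustrative parameters (an assumption): uniform logits over 3 categories.
logits = torch.zeros(3, requires_grad=True)
dist = Categorical(logits=logits)

# sample() draws an integer index; it is not connected to the graph.
sample = dist.sample()
sample_requires_grad = sample.requires_grad

# Categorical does not provide a reparameterized sampler.
has_rsample = dist.has_rsample

# log_prob() of the sample, by contrast, is differentiable w.r.t. the logits.
log_prob = dist.log_prob(sample)
log_prob_requires_grad = log_prob.requires_grad

print(sample_requires_grad, has_rsample, log_prob_requires_grad)
```

So the sample itself carries no gradient, while the log-probability of that sample does.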

mjas · November 7, 2018, 11:18am

Thanks very much. So I guess we can only learn the parameters of discrete distributions if they are at the output layer.
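
For reference, the torch.distributions documentation describes a score-function (REINFORCE) pattern in which the gradient is obtained from log_prob() of the sampled action rather than from the sample itself. The sketch below illustrates that pattern under stated assumptions: the reward function is hypothetical and stands in for whatever the downstream computation produces, and whether this suits a sample used in the middle of a graph depends on the model.

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Illustrative parameters (an assumption): uniform logits over 3 actions.
logits = torch.zeros(3, requires_grad=True)
dist = Categorical(logits=logits)

# Non-differentiable draw, exactly as in the thread.
action = dist.sample()

# Hypothetical reward: any scalar computed downstream from the action.
# It is treated as a constant with respect to the logits.
reward = float(action.item()) + 1.0

# Score-function surrogate loss: the gradient comes from log_prob, not from action.
loss = -dist.log_prob(action) * reward
loss.backward()

grad = logits.grad  # populated even though action has requires_grad=False
print(grad)
```

With this surrogate, the logit of the sampled action is pushed up when the reward is positive, without any gradient passing through the sample.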