Does distributions.Categorical use REINFORCE for the gradient calculation?

I use the output of `torch.distributions.Categorical.sample()` internally in my computation graph, not as the final step. (My sampled action affects the update of a state variable.) What gradient calculation does PyTorch apply by default in this case: REINFORCE, a pathwise gradient, or nothing?

Hi,

The default is nothing, as far as I know. The returned samples are integer indices with `requires_grad=False`, so no gradient flows back through `sample()` to the distribution's parameters.
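
A quick way to check this is the minimal sketch below. The `logits` tensor is a hypothetical stand-in for whatever parameters feed the distribution; only the `Categorical` calls come from PyTorch itself.

```python
import torch
from torch.distributions import Categorical

# Hypothetical learnable parameters (assumption for illustration).
logits = torch.zeros(3, requires_grad=True)
dist = Categorical(logits=logits)

sample = dist.sample()
print(sample.dtype, sample.requires_grad)  # integer index, detached from the graph
print(dist.has_rsample)                    # whether a reparameterized sampler exists

# The log-probability of the sample is still differentiable w.r.t. logits.
log_prob = dist.log_prob(sample)
print(log_prob.requires_grad)
```

So the sample itself carries no gradient, but `log_prob` of that sample does.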

Thanks very much. So I guess we can only learn the parameters of discrete distributions if they are at the output layer.
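
For what it's worth, the REINFORCE (score-function) estimator mentioned in the question can still be written by hand with `log_prob`, even when the sample is used in the middle of the computation. The sketch below is only an illustration under made-up assumptions: `state_updates`, the initial state `0.5`, and the squared-error loss are hypothetical, and the estimate is a single-sample one, so it will be noisy.

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical parameters and per-action state updates (assumptions for illustration).
logits = torch.zeros(3, requires_grad=True)
state_updates = torch.tensor([-1.0, 0.0, 2.0])

dist = Categorical(logits=logits)
action = dist.sample()                              # non-differentiable integer index
state = torch.tensor(0.5) + state_updates[action]   # sample used inside the computation
loss = (state - 1.0) ** 2                           # downstream loss; no gradient path to logits

# Score-function surrogate: its gradient is loss * d/d(logits) log p(action).
surrogate = dist.log_prob(action) * loss.detach()
surrogate.backward()
print(logits.grad)
```

Here `loss.backward()` alone would fail, because nothing in `loss` depends differentiably on `logits`. The surrogate term is what carries the gradient to the distribution's parameters.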