Consider the following code. Will this be part of the computation graph, with gradients computed correctly? I mean, can distributions be part of the graph?
import torch.distributions.normal as tdn

dist = tdn.Normal(mean, std)  # assume the network predicts mean and std
loss = -dist.log_prob(y)
Yes, log_prob is differentiable for Normal. You can check that by verifying that dist.log_prob(y).requires_grad == True. In general, if an output requires grad, that means it was produced by differentiable operations on tensors that require grad, so autograd can backpropagate through it.
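A minimal sketch to verify this (the tensors params, mean, std, and y below are stand-ins for a network's outputs and a target; the parametrization through exp is just one way to keep std positive):

```python
import torch
import torch.distributions.normal as tdn

# Toy "parameters" standing in for a network's output.
params = torch.tensor([0.0, 0.0], requires_grad=True)
mean = params[0]
std = torch.exp(params[1])  # ensures std > 0

y = torch.tensor(1.5)
dist = tdn.Normal(mean, std)
loss = -dist.log_prob(y)

# The loss is part of the graph, so it requires grad...
print(loss.requires_grad)  # True

# ...and backward() fills in gradients for the parameters.
loss.backward()
print(params.grad)
```

For a standard Normal at these values, the gradient of the negative log-likelihood with respect to the mean is -(y - mean)/std² = -1.5, so you can sanity-check the numbers by hand.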
Another question: dist.cdf(y) will also be differentiable, right? (Note that cdf is a method on the distribution instance, not on the tdn module.)
I was also wondering whether dist = tdn.Normal(mean, std) would mess up the computational graph. It will not, right?
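Right, constructing the distribution does not detach anything: Normal just stores the mean and std tensors, and cdf is computed with torch.erf, which autograd supports. A small sketch (mean, std, y are illustrative values, not from the original post):

```python
import torch
import torch.distributions.normal as tdn

mean = torch.tensor(0.0, requires_grad=True)
std = torch.tensor(1.0, requires_grad=True)
y = torch.tensor(0.0)

# Building the distribution keeps mean/std in the graph.
dist = tdn.Normal(mean, std)

# cdf is differentiable, so gradients flow back to mean and std.
c = dist.cdf(y)
print(c.requires_grad)  # True
c.backward()
print(mean.grad)  # equals -pdf(y) here, i.e. about -0.3989
```

The gradient of the cdf with respect to the mean is -pdf((y - mean)/std)/std, which at these values is -1/sqrt(2*pi) ≈ -0.3989, so the printed gradient can be checked by hand.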