My neural network computes its loss as the KL divergence (`torch.distributions.kl_divergence`) between two Beta distributions (PyTorch distribution objects). I know I can do this for a single example, but can I do it for a whole batch of distributions held in a single object? Right now the code looks like:

```
import torch
from torch.distributions import Beta

# Column 0 holds alpha, column 1 holds beta, one row per example.
alpha_actual = torch.Tensor(output_scores[:,0])
beta_actual = torch.Tensor(output_scores[:,1])
actual_distrib = Beta(alpha_actual, beta_actual)
alpha_y = torch.Tensor(y_batch[:,0])
beta_y = torch.Tensor(y_batch[:,1])
y_distrib = Beta(alpha_y, beta_y)
loss = torch.distributions.kl_divergence(actual_distrib, y_distrib).mean()
```

…where `alpha_actual` and `beta_actual` are `torch.Size([batch_size])`. My code doesn’t break or anything when I do this, but I’m having trouble wrapping my head around how the distribution object works if I feed it a batch of parameters instead of a single parameter. Does it actually store a batch of distributions in the single object? Or am I doing this wrong?
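One way to check this is to inspect the distribution's `batch_shape` and compare the batched KL against a loop over single distributions. The sketch below is only a sanity check under assumptions: `params_p` and `params_q` are made-up stand-ins for `output_scores` and `y_batch`, with alpha in column 0 and beta in column 1.

```python
import torch
from torch.distributions import Beta, kl_divergence

# Hypothetical stand-in data: 4 examples, columns are (alpha, beta).
params_p = torch.tensor([[2.0, 3.0], [1.0, 1.0], [5.0, 2.0], [0.5, 0.5]])
params_q = torch.tensor([[2.0, 3.0], [2.0, 2.0], [1.0, 4.0], [0.5, 0.5]])

# One Beta object built from 1-D parameter tensors of length 4.
p = Beta(params_p[:, 0], params_p[:, 1])
q = Beta(params_q[:, 0], params_q[:, 1])

print(p.batch_shape)  # torch.Size([4]): four independent Beta distributions
print(p.event_shape)  # torch.Size([]): each one is over a scalar

# Batched KL: one value per example.
kl_batch = kl_divergence(p, q)

# The same thing computed one distribution at a time, for comparison.
kl_single = torch.stack([
    kl_divergence(Beta(a_p, b_p), Beta(a_q, b_q))
    for (a_p, b_p), (a_q, b_q) in zip(params_p, params_q)
])

print(torch.allclose(kl_batch, kl_single, atol=1e-5))

# Reducing with mean() gives a scalar that can serve as the loss.
loss = kl_batch.mean()
```

If the two results agree, the single object is behaving as a batch of independent distributions, and `.mean()` is just averaging the per-example KL values.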