I want to compute the KL divergence between two batches of distributions. x is my tensor with the predicted distributions and target contains the target distributions. Both have shape (batch_size, max_dist_size). Each row of x and target contains a distribution whose support size is n <= max_dist_size, and a list dist_size holds that support size for each row.
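For concreteness, a minimal setup matching this description might look like the following sketch (the sizes and values here are purely illustrative):

```python
import torch

batch_size, max_dist_size = 4, 6
dist_size = [3, 6, 4, 2]  # hypothetical support size n per row

# each target row is a probability distribution over its first n entries,
# zero-padded out to max_dist_size
target = torch.zeros(batch_size, max_dist_size)
for i, n in enumerate(dist_size):
    target[i, :n] = torch.softmax(torch.randn(n), dim=0)
# x would hold the model's predictions with the same shape
```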
I am currently considering doing something like this:
```python
import torch.nn as nn

# KLDivLoss expects log-probabilities as input and probabilities as target;
# reduction='sum' replaces the deprecated size_average=False
criterion = nn.KLDivLoss(reduction='sum')

l = 0.
for i in range(x.size(0)):
    l += criterion(x[i, :dist_size[i]].unsqueeze(0),
                   target[i, :dist_size[i]].unsqueeze(0))
```
Is there a better way to use the
dist_size list to mask and obtain the KL divergence over the entire batch?
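One vectorized possibility, sketched below under the assumption that x already holds log-probabilities (which nn.KLDivLoss expects as input): compute the elementwise KL terms with F.kl_div(..., reduction='none') and keep only the positions inside each row's support via a mask built from torch.arange.

```python
import torch
import torch.nn.functional as F

def masked_kl_div(x, target, dist_size):
    """Summed KL divergence over a batch of padded distributions.

    x:         (batch_size, max_dist_size) log-probabilities
    target:    (batch_size, max_dist_size) probabilities
    dist_size: per-row support sizes (list or tensor of length batch_size)
    """
    dist_size = torch.as_tensor(dist_size, device=target.device)
    # mask[i, j] is True exactly where j < dist_size[i]
    positions = torch.arange(target.size(1), device=target.device)
    mask = positions.unsqueeze(0) < dist_size.unsqueeze(1)
    # elementwise KL terms: target * (log(target) - x)
    kl = F.kl_div(x, target, reduction='none')
    return kl[mask].sum()
```

Each element returned by F.kl_div with reduction='none' is target * (log(target) - x), so summing only the masked elements should reproduce the per-row loop above in a single batched call.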