KL divergence results different from numpy

I’ve been trying to implement the KL divergence in tf/pytorch and numpy. So far (thanks to @Nikronic) the tf and pytorch results are similar, but the numpy version is quite off, and I cannot find any reason why. One thing I’ve noticed is that if preds and labels each contain one array repeated multiple times (np.broadcast_to(np.random.uniform(0., 1., (11,)), (64, 12, 300, 11))), the results are very similar. Any help will be more than appreciated.

import numpy as np
import tensorflow as tf  # TF 1.x, with eager execution enabled
import torch
from torch.distributions import kl_divergence

preds = np.random.uniform(0., 1., (64, 12, 300, 11))
labels = np.random.uniform(0., 1., (64, 12, 300, 11))

# TensorFlow: categorical distributions over the last axis
preds_tf = tf.distributions.Categorical(probs=tf.convert_to_tensor(preds))
labels_tf = tf.distributions.Categorical(probs=tf.convert_to_tensor(labels))
tf_res = tf.reduce_mean(tf.distributions.kl_divergence(preds_tf, labels_tf))

# PyTorch: logits=log(probs) is equivalent to passing probs directly
preds_torch = torch.distributions.Categorical(probs=torch.from_numpy(preds))
labels_torch = torch.distributions.Categorical(logits=torch.from_numpy(labels).log())
torch_res = torch.mean(kl_divergence(preds_torch, labels_torch))

# NumPy: KL(preds || labels) over the last axis, averaged over the rest
np_res = np.mean(np.sum(preds * np.log(preds / labels), axis=-1))
print(tf_res.numpy(), torch_res.item(), np_res)
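
One thing I’m now wondering (and haven’t verified for TF): torch.distributions.Categorical documents that probs is normalized to sum to 1 along the last dimension, and I assume tf.distributions.Categorical does something equivalent, while my NumPy formula uses the raw uniform samples, which don’t sum to 1. A minimal sketch of a normalized NumPy version, under that assumption:

# Assumption: Categorical normalizes probs along the last axis,
# so do the same before applying the KL formula.
preds_n = preds / preds.sum(axis=-1, keepdims=True)
labels_n = labels / labels.sum(axis=-1, keepdims=True)
np_res_norm = np.mean(np.sum(preds_n * np.log(preds_n / labels_n), axis=-1))
print(np_res_norm)

If that matches the TF/PyTorch numbers, the discrepancy would just be the missing normalization, but I’d appreciate confirmation.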