KL divergence returns NaN on MPS

I am running into the following issue:

Python 3.10.8 (main, Oct 13 2022, 09:48:40) [Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> ref = torch.Tensor([0., 0., 1.])
>>> pred = torch.Tensor([-1.09861, -1.09861, -1.09861])
>>> torch.nn.functional.kl_div(pred, ref, reduction='batchmean')
tensor(0.3662)
>>> refmps = ref.to('mps')
>>> predmps = pred.to('mps')
>>> torch.nn.functional.kl_div(predmps, refmps, reduction='batchmean').cpu()
tensor(nan)
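
The CPU number matches what I get computing the pointwise KL term y * (log y - x) by hand, treating the 0 * log(0) entries as 0. A quick sanity check (my own sketch, not library code):

import math
import torch

ref = torch.tensor([0., 0., 1.])
pred = torch.full((3,), math.log(1.0 / 3.0))  # same log-probs as -1.09861 above

# Pointwise term y * (log y - x); mask y == 0 so 0 * log(0) contributes 0, not NaN
terms = torch.where(ref > 0, ref * (ref.log() - pred), torch.zeros_like(ref))
print(terms.sum() / ref.size(0))  # tensor(0.3662); 'batchmean' divides by size(0)

So my guess is that the MPS kernel is not masking that 0 * (-inf) product, but that is just a hunch.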

I am using PyTorch 1.13.0. Am I doing something wrong here, or is this an issue I should report on GitHub?
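
For now, one stopgap (based on the CPU path above giving the right value) is to compute just the loss on CPU and move the scalar back:

loss = torch.nn.functional.kl_div(predmps.cpu(), refmps.cpu(), reduction='batchmean')
loss = loss.to('mps')  # move the scalar result back to the MPS device if needed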