Hi all,
I’m trying to use F.kl_div as a memory-efficient way of calculating the KL divergence between two distributions during the forward pass of my model, as a form of regularisation. I had previously rolled my own implementation, but I noticed that PyTorch already has a built-in version implemented in C under the hood, so I swapped that in.
However, trying to use it in my forward pass gives the error in the title. Is there a way around this?
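For context, here is a minimal sketch of the kind of swap described above, not the actual model code. It assumes two categorical distributions given as logits; the names `p_logits`, `q_logits` and `manual_kl` are hypothetical. Note that F.kl_div expects its first argument as log-probabilities and its second as probabilities (with the default `log_target=False`).

```python
import torch
import torch.nn.functional as F


def manual_kl(p, q_log):
    # Hand-rolled KL(p || q) = sum_i p_i * (log p_i - log q_i),
    # summed over the last dimension and averaged over the batch.
    return (p * (p.log() - q_log)).sum(dim=-1).mean()


torch.manual_seed(0)

# Hypothetical inputs: a batch of 4 distributions over 10 classes.
p_logits = torch.randn(4, 10)                      # target side, no gradient
q_logits = torch.randn(4, 10, requires_grad=True)  # side being regularised

p = F.softmax(p_logits, dim=-1)          # target: probabilities
q_log = F.log_softmax(q_logits, dim=-1)  # input: log-probabilities

# reduction="batchmean" divides the summed loss by the batch size,
# which matches the hand-rolled version above.
kl_builtin = F.kl_div(q_log, p, reduction="batchmean")
kl_manual = manual_kl(p, q_log)

# Gradients flow back to q_logits through the input argument.
kl_builtin.backward()
```

In this sketch only the input side (`q_logits`) requires a gradient; the target is treated as a constant. That is a choice made for the illustration, not necessarily how the model in question is set up.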
Thanks
Kris