How to compute KL divergence for other distributions?

Your formula looks correct, but there is a thread that addresses your issue:

You may be getting wrong results because RelaxedOneHotCategorical is numerically unstable. The linked thread also provides a solution for that.
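For illustration, here is a minimal sketch of one common workaround. It assumes you are using PyTorch, and it is not necessarily the fix proposed in the linked thread. The idea is to estimate the KL by Monte Carlo while staying in log space with `ExpRelaxedCategorical`, so samples are never exponentiated onto the simplex, where values can underflow to 0. Because KL divergence is invariant under an invertible transform such as `exp`, the estimate should also apply to the corresponding RelaxedOneHotCategorical pair. The helper name `mc_kl_relaxed`, the shared temperature, and the sample count are my own choices.

```python
import torch
from torch.distributions.relaxed_categorical import ExpRelaxedCategorical


def mc_kl_relaxed(q_logits, p_logits, temperature, num_samples=1000):
    """Monte Carlo estimate of KL(q || p) for two relaxed categoricals.

    Hypothetical helper: both distributions share one temperature and are
    evaluated in log space to avoid underflow on the simplex.
    """
    temp = torch.as_tensor(temperature, dtype=q_logits.dtype)
    q = ExpRelaxedCategorical(temp, logits=q_logits)
    p = ExpRelaxedCategorical(temp, logits=p_logits)
    # Log-space samples from q; rsample keeps the estimate differentiable.
    z = q.rsample((num_samples,))
    # KL(q || p) = E_q[log q(z) - log p(z)], averaged over the samples.
    return (q.log_prob(z) - p.log_prob(z)).mean(dim=0)


torch.manual_seed(0)
q_logits = torch.tensor([2.0, 0.0, -1.0])
p_logits = torch.zeros(3)
kl_estimate = mc_kl_relaxed(q_logits, p_logits, temperature=0.5, num_samples=5000)
kl_same = mc_kl_relaxed(q_logits, q_logits, temperature=0.5)
```

The estimate is noisy, so if you use it as a loss term you may want to increase `num_samples` or average over the batch.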