Hello everyone,
I am implementing a multivariate normal distribution with autograd support, and I wrote the backward pass by hand. When I use gradcheck.py to verify my gradients, the test cases always pass when the inputs are on the CPU, but they fail when the inputs are on the GPU. Looking into gradcheck.py, I found that the analytical gradient comes back as all zeros, which suggests that after calling .backward() the gradients on my variables are all None. However, my own earlier test cases show that the backward function of this custom operation works fine with CUDA tensors.