I am working on an implementation that computes an exponential moving average (EMA) for bias correction in the MINE estimator. However, the backward function in EMALoss is never called and no gradient is computed. Any idea why?
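For context, EMALoss is a custom torch.autograd.Function roughly along these lines (a simplified sketch of the standard MINE EMA trick, not my exact code):

```python
import torch

class EMALoss(torch.autograd.Function):
    """log(mean(exp(x))), with the gradient denominator replaced by a running EMA."""

    @staticmethod
    def forward(ctx, x, running_mean):
        ctx.save_for_backward(x, running_mean)
        return x.exp().mean().log()

    @staticmethod
    def backward(ctx, grad_output):
        x, running_mean = ctx.saved_tensors
        # Bias-corrected gradient: use the EMA of mean(exp(x)) instead of the
        # noisy per-batch estimate in the denominator.
        grad_x = grad_output * x.exp() / (running_mean + 1e-6) / x.shape[0]
        return grad_x, None          # no gradient w.r.t. the running mean
```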
Thanks Frank. Well, maybe I should have been more explicit. I use MINE as a loss (x and z are the inputs and T is an embedding network) - it is a mutual information estimator. In this setting, EMALoss acts as a helper and does not work as expected.
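To be more concrete, the wiring looks roughly like this (a simplified sketch, not my exact code; EMALoss is the custom Function from my first post):

```python
import torch
import torch.nn as nn

class Mine(nn.Module):
    def __init__(self, T, alpha=0.01):
        super().__init__()
        self.T = T                                  # embedding / statistics network, takes (x, z)
        self.alpha = alpha
        self.register_buffer("running_mean", torch.tensor(1.0))

    def forward(self, x, z):
        z_marg = z[torch.randperm(z.shape[0])]      # shuffle z to sample from the marginals
        t_joint = self.T(x, z).mean()
        t_marg = self.T(x, z_marg)

        # Update the EMA of mean(exp(T)) outside the graph; it only enters
        # EMALoss.backward for the bias-corrected gradient.
        with torch.no_grad():
            self.running_mean.mul_(1 - self.alpha).add_(self.alpha * t_marg.exp().mean())

        second_term = EMALoss.apply(t_marg, self.running_mean)
        return -(t_joint - second_term)             # negative MI lower bound, used as the loss
```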
It is not clear what you are doing here. (For example, the forward()
method of your MINE class uses self.T, but self.T is never defined.)
Could you simplify your code down to the bare essentials that reproduce
your issue and then post the simplified version as a complete, runnable
script?
I observed that combining APEX 16-bit training with a torch.autograd.Function for the loss computation creates very weird behavior, e.g., inconsistent matrix caching and no gradients. Training in 32-bit, or disabling autocasting with autocast(enabled=False), fixed the problem for me…
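Concretely, what worked for me was either wrapping the loss computation in a `with torch.cuda.amp.autocast(enabled=False):` block, or decorating the custom Function with torch.cuda.amp.custom_fwd/custom_bwd so its forward runs in float32. A rough sketch (MyLoss is just a placeholder for the custom Function, not the code from this thread):

```python
import torch
from torch.cuda.amp import custom_fwd, custom_bwd

class MyLoss(torch.autograd.Function):
    @staticmethod
    @custom_fwd(cast_inputs=torch.float32)   # cast inputs to fp32 and disable autocast inside forward
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.exp().mean().log()

    @staticmethod
    @custom_bwd                              # backward runs with the same autocast state as forward
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * torch.softmax(x, dim=0)   # d/dx log(mean(exp(x))) for a 1-D x
```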