Training half-precision models with full-precision gradients

I’m trying to train a half-precision model with the Adam optimizer.
I ran into a gist implementing an Adam optimizer that keeps full-precision duplicates of the gradients and activations for more accurate back-propagation.
Is something similar supported in the 0.4.0 release, or will it be supported in future releases? For my specific use case it worked much better than using plain half-precision gradients.
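For reference, here is a minimal sketch of the pattern I mean (this is my own toy example, not the gist itself): the model parameters live in half precision, while Adam steps full-precision "master" copies, with gradients upcast to fp32 before each update.

```python
import torch

torch.manual_seed(0)

# Half-precision model parameter (a toy single weight tensor here).
w = torch.randn(4, requires_grad=True, dtype=torch.half)

# Full-precision master copy that the optimizer actually updates.
w_master = w.detach().clone().float().requires_grad_(True)
optimizer = torch.optim.Adam([w_master], lr=1e-1)

for _ in range(3):
    x = torch.randn(4, dtype=torch.half)
    loss = (w * x).sum().float()   # forward in half, reduce loss in fp32
    loss.backward()                # w.grad is computed in half precision

    with torch.no_grad():
        w_master.grad = w.grad.float()  # upcast grads for the Adam update
        optimizer.step()
        optimizer.zero_grad()
        w.copy_(w_master.half())        # sync the half model from master
        w.grad = None
```

Having a built-in way to do this (rather than hand-rolling the copy/upcast loop) is what I’m asking about.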