I wonder if the devs have any specific advice on training with half precision?
I converted my model to run with cuda().half(), but it does not seem to converge.
Is there something I should be aware of?
Thank you!
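For reference, the conversion I mean is roughly the sketch below (the ResNet-18 and the random dummy batch are just placeholders to show the dtype handling, not my actual model or data):

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Cast the whole model to fp16 and run it on the GPU.
model = models.resnet18(num_classes=10).cuda().half()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for step in range(100):
    # Dummy batch for illustration; in practice this comes from a DataLoader.
    images = torch.randn(32, 3, 224, 224).cuda().half()  # inputs must be fp16 too
    targets = torch.randint(0, 10, (32,)).cuda()

    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
```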
This week Amazon launched AWS P3 instances with Tesla V100 cards, which support half-precision training, so I am reviving this old topic.
I only have experience with full-precision training on a Titan X. If anyone has insight on how to train with half precision on Tesla V100 or P100 cards, please share it with us!
@FuriouslyCurious, did you manage to run anything at all on the V100?
Please see here for tips on training with mixed precision: http://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
Those tips are very interesting. Has anyone got examples of implementing them in PyTorch? Are there any examples of successfully training in half precision with PyTorch, especially for standard architectures like ResNet?
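In case it helps, here is a rough sketch of what the tips in that NVIDIA doc (fp32 master weights plus loss scaling) could look like in plain PyTorch. The ResNet-18, the dummy data, and the static scale of 128 are just values I picked for illustration, not an official recipe:

```python
import torch
import torch.nn as nn
import torchvision.models as models

loss_scale = 128.0  # static loss scale; the NVIDIA doc discusses how to choose/adjust this

# fp16 model for forward/backward, plus an fp32 "master" copy used for the weight update.
model = models.resnet18(num_classes=10).cuda().half()
master_params = [p.detach().clone().float() for p in model.parameters()]
for p in master_params:
    p.requires_grad = True

optimizer = torch.optim.SGD(master_params, lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for step in range(100):
    # Dummy batch for illustration; in practice this comes from a DataLoader.
    images = torch.randn(32, 3, 224, 224).cuda().half()
    targets = torch.randint(0, 10, (32,)).cuda()

    loss = criterion(model(images), targets)
    model.zero_grad()
    (loss * loss_scale).backward()  # scale the loss so small fp16 gradients don't underflow

    # Copy the scaled fp16 gradients into the fp32 master params and undo the scaling.
    for master, p in zip(master_params, model.parameters()):
        if p.grad is not None:
            master.grad = p.grad.detach().float() / loss_scale

    optimizer.step()  # update happens in fp32

    # Copy the updated fp32 master weights back into the fp16 model.
    with torch.no_grad():
        for master, p in zip(master_params, model.parameters()):
            p.copy_(master)
```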