Simple test for mixed precision on RTX 2070?

(Eric Perbos-Brinck) #1


I’m currently on the fast.ai MOOC, using their fastai v1 library built on top of PyTorch 1.0.
It works like a charm on a 1080 Ti + Ryzen 1700X, on Ubuntu 16.04 with Nvidia driver 410.73.

With the new generation of Nvidia RTX cards offering Tensor Cores, and the possibility of FP16 training via mixed-precision, I got hold of an RTX 2070.
When I try to run my usual Jupyter notebook in mixed precision on the RTX 2070, it crashes the kernel (without a specific error message to track the issue, just “The kernel appears to have died. It will restart automatically.”).

So I thought the first step might be to go down one level, to pure PyTorch code, and run a “basic test” that checks if/how it triggers the Tensor Cores and FP16 training.

Is that possible, and if so, how should I proceed?
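For reference, such a basic test might look like the sketch below (my own minimal example, not from the thread): one forward/backward pass through a half-precision matmul, with dimensions that are multiples of 8 so cuBLAS/cuDNN can route it through the Tensor Cores on a Turing card.

```python
import torch

def fp16_smoke_test(device: str = "cuda") -> torch.dtype:
    """One forward/backward pass; returns the gradient dtype.
    On CUDA this runs in half precision; on CPU it falls back to
    float32, since CPU support for half matmul varies by version."""
    dtype = torch.half if device == "cuda" else torch.float
    x = torch.randn(64, 64, device=device, dtype=dtype, requires_grad=True)
    w = torch.randn(64, 64, device=device, dtype=dtype)
    loss = (x @ w).float().sum()  # accumulate the reduction in FP32
    loss.backward()
    return x.grad.dtype
```

On the 2070, `fp16_smoke_test("cuda")` should return `torch.float16`; timing it against an FP32 run of the same shapes gives a rough indication of whether the half kernels are actually being used.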

Best regards,



If your Jupyter Notebook kernel just dies, you could try to download your notebook as a Python script (.py) and run it in a terminal. This will usually yield a better error message.
Note that CUDA operations work asynchronously, so you might need to run your script with:
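The exact command didn’t survive in the thread, but the flag usually suggested for this is CUDA’s `CUDA_LAUNCH_BLOCKING=1` (e.g. `CUDA_LAUNCH_BLOCKING=1 python script.py`), which can also be set from Python before CUDA is initialized:

```python
import os

# CUDA kernels launch asynchronously, so an error can surface several
# Python lines after the op that caused it. Forcing synchronous
# launches makes the reported stack trace point at the real culprit.
# The variable must be set before CUDA is initialized:
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the variable on purpose
```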


(Eric Perbos-Brinck) #3

Thank you @ptrblck for your fast and explicit reply!

I will try your tip asap and report back.



(Eric Perbos-Brinck) #4

I ran my notebook as a Python script, and the error message, right at the start of the epoch, is:

Floating point exception (core dumped)


Thanks for the information!
Could you try to get the backtrace using these commands?

As far as I’ve understood your question, the script runs fine without FP16?

(Eric Perbos-Brinck) #6

Yes, the script works fine without FP16.

I got this traceback in two pictures:

(Eric Perbos-Brinck) #7


Thanks for the backtrace.
Skimming through it, could it be that you are feeding torch.float data somewhere into torch.half layers?
Could you post your model definition?
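To illustrate the kind of mismatch meant here (a toy example of mine, not the OP’s model): a layer converted with `.half()` receiving a default float32 input tensor, which PyTorch rejects with a RuntimeError.

```python
import torch
import torch.nn as nn

def dtype_mismatch_demo() -> str:
    """Feed torch.float data into a torch.half layer and report
    what happens."""
    layer = nn.Linear(8, 8).half()  # weights are now torch.float16
    x = torch.randn(2, 8)           # inputs default to torch.float32
    try:
        layer(x)
    except RuntimeError as err:
        return type(err).__name__   # PyTorch flags the dtype mismatch
    return "no error"
```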

(Eric Perbos-Brinck) #9

This might take some time to answer, as I’m using a script for CIFAR-10 with a wrn_22() model from the high-level fastai library (like Keras for PyTorch) used in the current MOOC.

(Eric Perbos-Brinck) #10

I think this repo is dedicated to exploring mixed precision with PyTorch.

I’m running the scripts with the word language model example, and can see a slight performance boost (+15%) with --fp16 on the 2070.


Yes, apex makes sure mixed precision models work fine, i.e. potentially unsafe ops are performed in FP32, while other operations are performed using FP16.
I wanted to post this as the next suggestion, but you were faster. :wink:
Were you able to run your script with apex?
I’m not sure how easy that would be using the wrapper.
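The rule of thumb apex automates can be sketched by hand in a few lines (my own toy illustration, not apex code): do the cheap, numerically safe math in FP16, and cast up to FP32 for risky reductions such as softmax.

```python
import torch

def mixed_precision_step(x: torch.Tensor) -> torch.Tensor:
    """Safe elementwise work in FP16, numerically risky
    normalization in FP32."""
    h = x.half() * 2                 # "safe" op stays in half precision
    h = h.float()                    # cast up before the risky part
    return torch.softmax(h, dim=-1)  # softmax kept in FP32
```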

(Eric Perbos-Brinck) #12

Maybe another tip, via a ticket on the PyTorch GitHub, for the same error:

Floating point exception (core dumped)

The cause is a bug in cuDNN 7.1.4 that didn’t exist in 7.1.2 and was fixed in 7.2.

When I check my current pytorch-nightly package, it’s named “1.0.0.dev20181024-py3.7_cuda9.2.148_cudnn7.1.4_0 pytorch [cuda92]” — i.e. it ships the buggy cuDNN 7.1.4.
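Instead of parsing the package name, the linked CUDA toolkit and cuDNN builds can also be checked from Python directly:

```python
import torch

# Report the toolkit and cuDNN versions the current PyTorch binary was
# built against, to confirm whether the buggy 7.1.4 build is in use.
# cudnn.version() returns an int such as 7104 for cuDNN 7.1.4, or
# None on a CPU-only build.
print("CUDA :", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
```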


+1, this works like a charm.
It also implements loss scaling, which I found to be necessary.
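Static loss scaling, the technique mentioned here, can be written out in a few lines of plain PyTorch (a sketch of the idea, not apex’s actual implementation): scale the loss up before backward so tiny FP16 gradients don’t underflow to zero, then unscale the gradients before the optimizer step.

```python
import torch

def scaled_backward(loss: torch.Tensor, optimizer, scale: float = 1024.0):
    """Backward pass with static loss scaling."""
    (loss * scale).backward()
    for group in optimizer.param_groups:
        for p in group["params"]:
            if p.grad is not None:
                p.grad.div_(scale)  # restore the true gradient magnitude
    optimizer.step()
```

Apex goes further with *dynamic* loss scaling, raising the scale while gradients stay finite and backing off on overflow.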

On small models you won’t see much of an uplift, but on a big ImageNet model like ResNet-18 or ResNet-50 you should see ~2x the performance (at least on a V100).

Also make sure you have the latest cuDNN; they’re up to 7.3.1 now.

Edit: Apex works like a charm