Segmentation fault with a K80

I encountered a segmentation fault when trying to do backprop on a K80. The same code runs fine on CPU, on a Titan X (Pascal), and on a GTX 1080.
The OS is Ubuntu 16.04, the CUDA version is 9.0.176, and the cuDNN version is 7.0.

I used gdb to debug it and got the following report:
gdb --args python main.py
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python…done.
(gdb) run
Starting program: /home/tjy/Applications/anaconda3/bin/python main.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffac8b2700 (LWP 25932)]
==> Preparing data…
Files already downloaded and verified
Files already downloaded and verified
==> Building model…
[New Thread 0x7fffa2d34700 (LWP 25937)]
[New Thread 0x7fffa21ff700 (LWP 25938)]
[New Thread 0x7fff8c7ff700 (LWP 25943)]

Thread 5 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff8c7ff700 (LWP 25943)]
0x00007ffff0eb18d5 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
(gdb)

libcuda.so.1 points to libcuda.so.390.30, so the NVIDIA driver version is 390.30.
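
Since gdb only shows an unresolved frame inside libcuda.so.1, a Python-level traceback might help narrow down which call triggers the crash. Below is a minimal sketch using the standard-library `faulthandler` module. It assumes the snippet is placed at the very top of main.py, whose contents are not shown here.

```python
import faulthandler

# Ask the interpreter to dump the Python traceback of every thread to
# stderr when the process receives a fatal signal such as SIGSEGV.
faulthandler.enable(all_threads=True)

# The rest of main.py (data loading, model building, training loop)
# would follow here unchanged; that part is an assumption.
```

The same handler can be enabled without editing the script by running `python -X faulthandler main.py`.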

Any thoughts?
Thanks!