Kernel dies when moving model to CUDA

Could you run your code with gdb and try to get the stack trace as explained in this post?
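
In case it helps, here is a rough sketch of one way to do that from Python. It assumes gdb is installed and that you have first copied the failing notebook cells into a standalone script; `repro.py`, `build_gdb_command` and `run_under_gdb` are just placeholder names for illustration. It only uses gdb's standard `-batch`, `-ex` and `--args` options.

```python
import subprocess
import sys


def build_gdb_command(script_path):
    """Build a gdb command line that runs a Python script and prints a backtrace."""
    return [
        "gdb",
        "-batch",        # leave gdb once the commands below have been processed
        "-ex", "run",    # start the Python process under gdb
        "-ex", "bt",     # print the native stack trace after the process stops
        "--args", sys.executable, script_path,
    ]


def run_under_gdb(script_path):
    """Run the script under gdb and return gdb's combined output as text."""
    result = subprocess.run(
        build_gdb_command(script_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout
```

Calling `print(run_under_gdb("repro.py"))` should show the output of the script followed by the backtrace if the process crashes (for example on a segfault). If the script exits normally, gdb has no stack to print. Running `gdb --args python repro.py` by hand and typing `run`, then `bt` after the crash, gives the same information.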