Segmentation fault with nn.DataParallel

I am getting a segmentation fault (PyTorch 0.4.0) when I wrap a particular module in nn.DataParallel.
The relevant code is:

model = NMTModel(nn.DataParallel(encoder, device_ids=[0]), nn.DataParallel(decoder, device_ids=[0]))

I also tried model = nn.DataParallel(NMTModel(encoder, decoder)), but I get the same error.
The same code works fine on PyTorch 0.3.1.post2, and it also works fine without the nn.DataParallel wrapping.
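For context, here is a minimal sketch of the setup that crashes for me. NMTModel here is a hypothetical stand-in for my real seq2seq wrapper, and the Linear layers are just placeholders for the actual encoder/decoder:

import torch
import torch.nn as nn

class NMTModel(nn.Module):
    # hypothetical stand-in that just chains encoder and decoder
    def __init__(self, encoder, decoder):
        super(NMTModel, self).__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, src):
        return self.decoder(self.encoder(src))

encoder = nn.Linear(16, 32)   # placeholder sub-modules
decoder = nn.Linear(32, 16)

# variant 1: wrap each sub-module in DataParallel
model = NMTModel(nn.DataParallel(encoder, device_ids=[0]),
                 nn.DataParallel(decoder, device_ids=[0]))

# variant 2: wrap the whole model instead (same segfault for me)
model = nn.DataParallel(NMTModel(encoder, decoder))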
I tried to debug by running Python under gdb.
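This is roughly how I invoked it (train.py is a placeholder for my actual training script):

gdb --args python train.py
(gdb) run

The stack trace after the crash: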

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7bcc410 in pthread_once () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007fffde7acfd1 in __gthread_once (__func=<optimized out>, __once=0x7fffdd5e9558 <at::globalContext()::globalContext_+408>)
    at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/x86_64-redhat-linux/bits/gthr-default.h:699
#3  call_once<at::Context::lazyInitCUDA()::<lambda()> > (__f=<optimized out>, __once=...) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/mutex:746
#4  lazyInitCUDA (this=0x7fffdd5e93c0 <at::globalContext()::globalContext_>) at /opt/conda/conda-bld/pytorch_1524584710464/work/torch/lib/tmp_install/include/ATen/Context.h:55
#5  THCPModule_initExtension (self=<optimized out>) at torch/csrc/cuda/Module.cpp:321
#6  0x0000555555662bda in _PyCFunction_FastCallDict ()
#7  0x00005555556f267c in call_function ()
#8  0x0000555555714cba in _PyEval_EvalFrameDefault ()
#9  0x00005555556ec70b in fast_function ()
#10 0x00005555556f2755 in call_function ()
#11 0x0000555555714cba in _PyEval_EvalFrameDefault ()
#12 0x00005555556ec70b in fast_function ()
#13 0x00005555556f2755 in call_function ()
#14 0x0000555555714cba in _PyEval_EvalFrameDefault ()
#15 0x00005555556ec70b in fast_function ()
#16 0x00005555556f2755 in call_function ()
#17 0x0000555555714cba in _PyEval_EvalFrameDefault ()
#18 0x00005555556eba94 in _PyEval_EvalCodeWithName ()
#19 0x00005555556ec941 in fast_function ()
#20 0x00005555556f2755 in call_function ()
#21 0x0000555555714cba in _PyEval_EvalFrameDefault ()
#22 0x00005555556ebdae in _PyEval_EvalCodeWithName ()
#23 0x00005555556ec941 in fast_function ()
#24 0x00005555556f2755 in call_function ()
#25 0x0000555555714cba in _PyEval_EvalFrameDefault () 
#26 0x00005555556ebc26 in _PyEval_EvalCodeWithName () 
#27 0x00005555556ece1b in _PyFunction_FastCallDict () 
#28 0x0000555555662f5f in _PyObject_FastCallDict ()   
#29 0x0000555555667a03 in _PyObject_Call_Prepend ()   
#30 0x000055555566299e in PyObject_Call ()
#31 0x00005555556bf02b in slot_tp_init ()
#32 0x00005555556f29b7 in type_call ()
#33 0x0000555555662d7b in _PyObject_FastCallDict ()   
#34 0x00005555556f27ce in call_function ()
#35 0x0000555555714cba in _PyEval_EvalFrameDefault () 
#36 0x00005555556eba94 in _PyEval_EvalCodeWithName () 
#37 0x00005555556ec941 in fast_function ()
#38 0x00005555556f2755 in call_function ()
#39 0x0000555555714cba in _PyEval_EvalFrameDefault () 
#40 0x00005555556ec70b in fast_function ()
#41 0x00005555556f2755 in call_function ()
#42 0x0000555555714cba in _PyEval_EvalFrameDefault ()
#43 0x00005555556ebdae in _PyEval_EvalCodeWithName ()
#44 0x00005555556ec941 in fast_function ()
#45 0x00005555556f2755 in call_function ()
#46 0x0000555555715a7a in _PyEval_EvalFrameDefault ()
#47 0x00005555556ed459 in PyEval_EvalCodeEx ()
#48 0x00005555556ee1ec in PyEval_EvalCode ()
#49 0x00005555557689a4 in run_mod ()
#50 0x0000555555768da1 in PyRun_FileExFlags ()
#51 0x0000555555768fa4 in PyRun_SimpleFileExFlags ()
#52 0x000055555576ca9e in Py_Main ()
#53 0x00005555556344be in main ()

I found that my issue was caused by the import of sentencepiece.

Changing the import order fixed it.
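Roughly, the change looked like the sketch below. I am not certain the torch-first ordering is what matters on every setup, so treat that as an assumption about my environment; the key point is that the relative import order of sentencepiece and torch made the difference for me.

# before (the ordering that segfaulted on my machine, as far as I can tell):
# import sentencepiece as spm
# import torch

# after (swapping the two imports avoided the crash):
import torch
import sentencepiece as spm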