GRU raises an error when running on a GPU

Traceback (most recent call last):
  File "/home/guoxi/work/SST/train.py", line 284, in <module>
    train(epoch, w1)
  File "/home/guoxi/work/SST/train.py", line 247, in train
    proposals = model(features)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/guoxi/work/SST/models.py", line 41, in forward
    rnn_output, _ = self.rnn(features)   #  xxxx,128,512
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/nn/modules/rnn.py", line 162, in forward
    output, hidden = func(input, self.all_weights, hx)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/nn/_functions/rnn.py", line 351, in forward
    return func(input, *fargs, **fkwargs)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/autograd/function.py", line 284, in _do_forward
    flat_output = super(NestedIOFunction, self)._do_forward(*flat_input)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/autograd/function.py", line 306, in forward
    result = self.forward_extended(*nested_tensors)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/nn/_functions/rnn.py", line 293, in forward_extended
    cudnn.rnn.forward(self, input, hx, weight, output, hy)
  File "/home/guoxi/anaconda2/lib/python2.7/site-packages/torch/backends/cudnn/rnn.py", line 305, in forward
    ctypes.c_void_p(fn.reserve.data_ptr()), fn.reserve.size(0)
RuntimeError: invalid argument 2: out of range at /pytorch/torch/lib/THC/generic/THCTensor.c:23

My code runs normally on the CPU, but it raises this error when run on a single GPU.