Hi, I encountered the following assertion error when running my code on GPU (things are fine on CPU):
/b/wheel/pytorch-src/torch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [179,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/b/wheel/pytorch-src/torch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [179,0,0], thread: [1,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/b/wheel/pytorch-src/torch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [179,0,0], thread: [2,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu line=226 error=59 : device-side assert triggered
Traceback (most recent call last):
  ....
    x = torch.cat([y_tm1_embed.squeeze(0), ctx_tm1], 1)
  File "torch/autograd/variable.py", line 836, in cat
    return Concat(dim)(*iterable)
  File "torch/autograd/_functions/tensor.py", line 310, in forward
    return torch.cat(inputs, self.dim)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:226
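If I read the assert correctly, it seems to fire when an index handed to index_select (e.g. inside an embedding lookup) is >= the size of the source tensor along the selected dimension, which only surfaces as a device-side assert on GPU. A toy example along these lines (not my real code, just my understanding of the message) would presumably hit the same assert:

import torch
import torch.nn as nn
from torch.autograd import Variable

embed = nn.Embedding(10, 4).cuda()                # vocabulary size 10
idx = Variable(torch.LongTensor([3, 12])).cuda()  # 12 is out of range for a size-10 vocab
out = embed(idx)                                  # presumably triggers the same device-side assert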
I ran with the flag CUDA_LAUNCH_BLOCKING=1 set. The full code is a simple attention-based encoder-decoder, where y_tm1_embed is the embedding of the previous word and ctx_tm1 is the previous context vector, initialized by:
ctx_tm1 = Variable(torch.zeros(batch_size, self.args.hidden_size * 2), requires_grad=False)
if self.args.cuda:
    ctx_tm1 = ctx_tm1.cuda()
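In case it helps narrow things down, a check like the following (self.embed and y_tm1 are just placeholders for my actual embedding layer and previous-word index tensor) should catch an out-of-range word index before it reaches the GPU kernel:

# hypothetical check: every previous-word index must fit inside the embedding table
assert y_tm1.min().data[0] >= 0, 'negative word index'
assert y_tm1.max().data[0] < self.embed.num_embeddings, 'word index out of range'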
Any suggestions? Thanks!