CUDA runtime error (59): device-side assert triggered

I have a CUDA tensor on one of my devices, say device 7, and I am simply trying to convert it to a NumPy array with

def np_like(probs):
    # Unwrap the Variable, copy to host memory, and drop singleton dimensions.
    return probs.data.cpu().numpy().squeeze()

Surprisingly, this simple conversion triggers a long run of device-side assertion failures:

/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [24,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [25,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [26,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [27,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [28,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [29,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/generic/THCTensorCopy.c line=70 error=59 : device-side assert triggered
node.value  Traceback (most recent call last):
  File "varian_main.py", line 382, in <module>
    neural_fsp.train_nets(mask.rsplit(sep=".")[0])
  File "varian_main.py", line 307, in train_nets
    best_node, self.player = mcts.run_tree_search(root_node, player=self.player)
  File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 127, in run_tree_search
    new_node        = self.tree_policy(root_node)
  File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 162, in tree_policy
    return self.expand(node)
  File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 195, in expand
    maybe_child = self.action_score(maybe_child)
  File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 281, in action_score
    print('node.value ', node.value)
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/autograd/variable.py", line 119, in __repr__
    return 'Variable containing:' + self.data.__repr__()
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/tensor.py", line 133, in __repr__
    return str(self)
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/tensor.py", line 140, in __str__
    return _tensor_str._str(self)
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 295, in _str
    strt = _vector_str(self)
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 271, in _vector_str
    fmt, scale, sz = _number_format(self)
  File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 79, in _number_format
    tensor = torch.DoubleTensor(tensor.size()).copy_(tensor).abs_().view(tensor.nelement())
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/generic/THCTensorCopy.c:70

I have seen similar errors on the GitHub issues page and on the discussion forum, but none of the suggestions there has fixed my problem so far.

I am on PyTorch 0.3.0.

Ah, never mind. I was using index_select elsewhere in my code, and that was the culprit, as others have pointed out before.
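
The traceback above ends in a print call rather than in index_select. That is consistent with CUDA kernels being launched asynchronously: a failed device-side assert is typically reported at a later synchronization point, here the copy to host memory. A minimal sketch for making the error surface at the offending line, assuming the CUDA_LAUNCH_BLOCKING environment variable is set before any CUDA work starts:

```python
import os

# Force synchronous kernel launches so a failing device-side assert is
# reported at the Python line that launched the kernel.
# Set this before the first CUDA call; before importing torch is simplest.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # then run the failing script as usual
```

Setting the variable on the command line when launching the script should have the same effect. Running the same code on the CPU is another option, since it usually raises an ordinary Python exception at the indexing call itself.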

Sorry for the bother.

Hi, I am facing the same error. I don’t use “index_select”; I just select with a generated index. Could you please tell me how you fixed your problem?
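
The assertion in the log, `srcIndex < srcSelectDimSize`, says an index reached past the end of the dimension being selected from, so the same failure can presumably come from any generated index, not only from index_select. A rough sketch of a bounds check to run just before the indexing line; the helper name find_out_of_range and the sample values are hypothetical, and treating negative indices as invalid is an assumption:

```python
def find_out_of_range(indices, dim_size):
    """Return (position, value) pairs for indices outside [0, dim_size)."""
    return [(pos, int(idx)) for pos, idx in enumerate(indices)
            if not 0 <= int(idx) < dim_size]


# Hypothetical example: a generated index into a dimension of size 5.
generated = [0, 3, 5, 2, -1]
bad = find_out_of_range(generated, dim_size=5)
print(bad)  # [(2, 5), (4, -1)]
```

For a 1-D index tensor, something like index.cpu().tolist() together with src.size(dim) should provide the two arguments. Loosen the lower bound if the indexing operation you use accepts negative indices.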