I have a CUDA tensor on some device, say GPU 7. I am simply trying to convert it to a NumPy
array with:
def np_like(probs):
    return probs.data.cpu().numpy().squeeze()
Surprisingly, this simple conversion triggers a long series of device-side assertion errors:
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [24,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [25,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [26,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [27,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [28,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [29,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/THCTensorIndex.cu:279: void indexSelectSmallIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [0,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/generic/THCTensorCopy.c line=70 error=59 : device-side assert triggered
node.value
Traceback (most recent call last):
File "varian_main.py", line 382, in <module>
neural_fsp.train_nets(mask.rsplit(sep=".")[0])
File "varian_main.py", line 307, in train_nets
best_node, self.player = mcts.run_tree_search(root_node, player=self.player)
File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 127, in run_tree_search
new_node = self.tree_policy(root_node)
File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 162, in tree_policy
return self.expand(node)
File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 195, in expand
maybe_child = self.action_score(maybe_child)
File "/home/lex/Documents/NNs/RadOncol/beam_optim/scripts/monte_carlo/mcts.py", line 281, in action_score
print('node.value ', node.value)
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/autograd/variable.py", line 119, in __repr__
return 'Variable containing:' + self.data.__repr__()
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/tensor.py", line 133, in __repr__
return str(self)
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/tensor.py", line 140, in __str__
return _tensor_str._str(self)
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 295, in _str
strt = _vector_str(self)
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 271, in _vector_str
fmt, scale, sz = _number_format(self)
File "/home/lex/anaconda3/envs/py35/lib/python3.5/site-packages/torch/_tensor_str.py", line 79, in _number_format
tensor = torch.DoubleTensor(tensor.size()).copy_(tensor).abs_().view(tensor.nelement())
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512383260527/work/torch/lib/THC/generic/THCTensorCopy.c:70
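Going by the traceback, the failure surfaces while printing node.value rather than inside np_like, and the assertion text `srcIndex < srcSelectDimSize` comes from an index-select kernel. My working assumption, not yet confirmed, is that an earlier indexing call received an out-of-range index and that the error only shows up at the next device-to-host copy because CUDA kernels run asynchronously. Below is a minimal sketch of how I could check this. The helper `out_of_range` and the sample values are hypothetical, and it assumes the indices have already been converted to a flat Python list of ints and that negative indices are not allowed.

```python
import os

# Assumption: this must be set before the first CUDA call in the process.
# It makes kernel launches synchronous, so the Python traceback should point
# at the call that actually failed rather than at a later copy or print.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def out_of_range(indices, dim_size):
    """Return (position, value) pairs for indices outside [0, dim_size).

    `indices` is a flat iterable of Python ints and `dim_size` is the size
    of the dimension being selected from. Negative values are reported as
    out of range (an assumption about the indexing call being checked).
    """
    return [(pos, idx) for pos, idx in enumerate(indices)
            if not 0 <= idx < dim_size]


# Hypothetical sample: selecting from a dimension of size 5.
bad = out_of_range([0, 3, 7, 2], dim_size=5)
print(bad)  # [(2, 7)]
```

If the list comes back non-empty for the indices fed to the indexing call, that call would be the place to look, not the conversion to NumPy.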
I have seen similar errors reported on the GitHub issues page and the discuss forum, but none of the suggestions there have fixed my problem so far.
I am on PyTorch 0.3.0.