RuntimeError: cublas runtime error


I get the following assertion error at the start of training (on GPU). Restarting the training or switching GPUs sometimes lets it continue without any errors. The same training procedure seems to work with other parameters without an issue.

/tmp/luarocks_cutorch-scm-1-4771/cutorch/lib/THC/ void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2]: block: [0,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.
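
For context, the assertion text itself says that an index passed to an index-select operation was not smaller than the size of the dimension being selected from. A possible way to narrow this down is to check the index values on the CPU before they reach the GPU. The snippet below is only a minimal sketch in plain Python rather than Lua, for illustration: the helper name `find_out_of_range`, the sample batch, and the size 5 are hypothetical, and it assumes 1-based indices as used by Torch7 tensors.

```python
def find_out_of_range(indices, dim_size, index_base=1):
    """Return (position, value) pairs for indices that fall outside
    the valid range [index_base, index_base + dim_size - 1].

    index_base=1 is an assumption matching 1-based Torch7/Lua indexing;
    pass index_base=0 for 0-based data.
    """
    lo = index_base
    hi = index_base + dim_size - 1
    return [(pos, idx) for pos, idx in enumerate(indices) if idx < lo or idx > hi]


if __name__ == "__main__":
    # Hypothetical example: a table with 5 rows and one batch of indices.
    table_rows = 5
    batch = [1, 3, 5, 6, 0]
    for pos, idx in find_out_of_range(batch, table_rows):
        print("position %d: index %d is outside 1..%d" % (pos, idx, table_rows))
```

If a check like this reports nothing for every batch, the bad index is probably produced somewhere other than the input data, which this sketch does not cover.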