I am getting a CUDA error from the TorchScript interpreter when using TensorRT inside PyTorch. The errors are shown below:
File "/home/ravi/yolact_edge/layers/output_utils.py", line 109, in postprocess
masks = crop(masks, boxes)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/home/ravi/yolact_edge/layers/box_utils.py", line 316, in fallback_function
"""
h, w, n = masks.size()
x1, x2 = sanitize_coordinates(boxes[:, 0], boxes[:, 2], w, padding, cast=False)
~~~~~~~~~~~~~~~~~~~~ <--- HERE
y1, y2 = sanitize_coordinates(boxes[:, 1], boxes[:, 3], h, padding, cast=False)
File "/home/ravi/yolact_edge/layers/box_utils.py", line 297, in sanitize_coordinates
_x1 = _x1.long()
_x2 = _x2.long()
x1 = torch.min(_x1, _x2)
~~~~~~~~~ <--- HERE
x2 = torch.max(_x1, _x2)
x1 = torch.clamp(x1-padding, min=0)
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:605: indexSelectSmallIndex: block: [0,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:605: indexSelectSmallIndex: block: [0,0,0], thread: [1,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:605: indexSelectSmallIndex: block: [0,0,0], thread: [2,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:605: indexSelectSmallIndex: block: [0,0,0], thread: [3,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/home/ravi/yolact_edge/layers/output_utils.py", line 100, in postprocess
masks = proto_data @ masks.t()
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
The two errors in the output above are:
RuntimeError: CUDA error: device-side assert triggered
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
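Because CUDA kernels are launched asynchronously, the Python line shown in a traceback like the one above is not necessarily the operation that actually failed, and the second error may simply be a consequence of the first. A minimal sketch of how the launches could be made synchronous so that the traceback points at the failing op, assuming the variable is set before CUDA is initialized (the helper name is hypothetical, not part of yolact_edge):

```python
import os

# Must be set before the first CUDA call (safest: before importing torch),
# otherwise the already-initialized CUDA context is not affected.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def cuda_launch_blocking_enabled() -> bool:
    """Hypothetical helper: report whether synchronous launches were requested."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"
```

The same effect can be had from the shell by prefixing the command with CUDA_LAUNCH_BLOCKING=1. This slows inference down, so it is only meant for the debugging run.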
Please note that these errors occur only when all of the following conditions are met:
- When TensorRT is used.
- After processing a few frames
- When the parameter score_threshold is non-zero
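The assertion `srcIndex < srcSelectDimSize` in Indexing.cu is raised when an index passed to an index-select operation is out of range for the dimension being indexed. Since the failure depends on score_threshold, one thing that could be checked is whether the indices used to select detections are valid for the tensor they index. The sketch below is only an illustration of such a check: the function name is hypothetical, it works on plain Python integers, and treating negative indices as invalid is a conservative assumption.

```python
from typing import Iterable, List


def out_of_range_indices(indices: Iterable[int], dim_size: int) -> List[int]:
    """Return every index that falls outside [0, dim_size).

    Negative indices are reported as invalid too (a conservative assumption).
    """
    return [int(i) for i in indices if not (0 <= int(i) < dim_size)]


# Hypothetical usage with tensors (names `keep` and `scores` are placeholders):
#   bad = out_of_range_indices(keep.cpu().tolist(), scores.size(0))
#   if bad:
#       print("invalid indices:", bad)
```

Copying the index tensor to the CPU before checking forces a synchronization, so a bad index would be reported at the point where it is produced rather than at a later kernel launch.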
Below are my environment details:
* python --version: 3.6.9
* tensorrt.__version__: 7.2.2.3
* torch.__version__: 1.7.1
* torchvision.__version__: 0.8.2
* torch.version.cuda: 10.2
My question is: how can I debug the above errors, which occur only when TensorRT is used?
PS: The source code is available here