CUDA runtime assertion error: index out of bounds

Can anyone help with the error below? I would like to know what causes it and where exactly it occurs.

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [14,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [15,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [16,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [17,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [18,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [19,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/p

Error Trace

  231         mask_feats = mask_roi_extractor(x[:mask_roi_extractor.num_inputs],
--> 232                                 enlarged_rois)
    233         if self.with_semantic:
    234             mask_semantic_feat = self.semantic_roi_extractor([semantic_feat],

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py in new_func(*args, **kwargs)
    212                     digit_version(TORCH_VERSION) >= digit_version('1.6.0')):
    213                 with autocast(enabled=False):
--> 214                     output = old_func(*new_args, **new_kwargs)
    215             else:
    216                 output = old_func(*new_args, **new_kwargs)

/kaggle/working/mmdetection/mmdet/models/roi_heads/roi_extractors/single_level_roi_extractor.py in forward(self, feats, rois, roi_scale_factor)
     98                 roi_feats += roi_feats_t
     99                 continue
--> 100             inds = mask.nonzero(as_tuple=False).squeeze(1)
    101             if inds.numel() > 0:
    102                 rois_ = rois[inds]

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1

Rerun the code via CUDA_LAUNCH_BLOCKING=1 python script.py args and check which operation is causing the out-of-bounds index error. Alternatively, run the code on the CPU which should also show the failing op.
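
In a notebook (the /kaggle/working paths in the trace suggest one) there is no command line to prefix, so one option is to set the variable from Python instead. This is a minimal sketch, assuming the cell runs before any CUDA work has started, ideally before torch is imported; the helper name enable_blocking_cuda_launches is hypothetical and not part of PyTorch.

```python
import os


def enable_blocking_cuda_launches():
    """Set CUDA_LAUNCH_BLOCKING=1 for the current process.

    Assumption: this runs before CUDA is initialized. If CUDA is already
    in use, the setting may have no effect and the kernel/notebook
    should be restarted first.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return os.environ["CUDA_LAUNCH_BLOCKING"]


enable_blocking_cuda_launches()
# import torch  # import torch and build the model only after this point
```

With synchronous launches the Python stack trace should point at the indexing operation that actually failed, rather than at a later, unrelated call.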

At the site of the error I added two print statements:

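# Debug prints: number of feature levels, how many the extractor uses, and the shape of the first level.
print('mask fwd ', 'xleng', len(x), 'roi',
      len(x[:mask_roi_extractor.num_inputs]), mask_roi_extractor.num_inputs,
      'x[0] shape', x[0].shape)
print('mask fwd not train', x[:mask_roi_extractor.num_inputs])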

This is the error I then get:


0/122, elapsed: 0s, ETA:torch.Size([100, 5]) mask_rois simple test (tensor([[[[ 1.2903e-01,  8.9258e-01,  7.3486e-01,  ...,  2.6465e+00,...  <this is printing some tensor some big value>
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:97: operator(): block: [0,0,0], thread: [82,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.

Below is the rest of the error that follows. I don't get the error during training, only during validation, at the same point. I did turn on CUDA_LAUNCH_BLOCKING.

return self.roi_head.simple_test(
--> 184             x, proposal_list, img_metas, rescale=rescale)
    185 
    186     def aug_test(self, imgs, img_metas, rescale=False):

/kaggle/working/mmdetection/mmdet/models/roi_heads/dsc_roi_head.py in simple_test(self, x, proposal_list, img_metas, rescale)
    570                 print(mask_rois.shape,'mask_rois simple test',  x)
    571                 mask_results = self._mask_forward(self.num_stages, x, mask_rois, 
--> 572                                                   mpn_results['enlarged_rois'][det_rois_inds]
    573                                                   ,semantic_feat,
    574                                                   mpn_results['res_feat'][det_rois_inds],img_shape,

RuntimeError: CUDA error: device-side assert triggered

@ptrblck could this be an issue with the tensor type or something similar? During training, print outputs everything, but during validation this issue comes up, i.e. print doesn't show all the values, only the first few.

PyTorch won’t print a huge tensor completely to avoid spamming the terminal, so check where the invalid index in the tensor is and make sure to remove it.
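
One way to locate the invalid index is to check the index values against the size of the dimension being indexed, using the same condition as the assertion in the log (index >= -size and index < size). This is a minimal sketch; find_invalid_indices is a hypothetical helper, and it assumes a 1-D index given as a plain list or as a tensor/array with a .tolist() method.

```python
def find_invalid_indices(indices, size):
    """Return (position, value) pairs for entries outside [-size, size).

    `indices` may be a plain list of ints or a 1-D integer tensor/array
    (anything exposing .tolist()); `size` is the length of the dimension
    that is being indexed.
    """
    values = indices.tolist() if hasattr(indices, "tolist") else list(indices)
    # Same condition as the CUDA assertion: index >= -size and index < size.
    return [(pos, v) for pos, v in enumerate(values) if not (-size <= v < size)]


# Example with plain lists: indexing a dimension of size 100.
bad = find_invalid_indices([0, 99, 100, -100, -101], size=100)
print(bad)  # [(2, 100), (4, -101)]
```

In this thread it could be called just before the failing line, for example on det_rois_inds with size mpn_results['enlarged_rois'].shape[0]. Which index tensor is actually at fault is not established here, so treat that call as an illustration only.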