The torchvision model zoo doc for Faster R-CNN says the following:
During training, the model expects both the input tensors, as well as a targets dictionary, containing:
- boxes (Tensor[N, 4]): the ground-truth boxes in [x0, y0, x1, y1] format, with values between 0 and H and 0 and W
These are my sample inputs, printed as H, W, [x0, y0, x1, y1]:
124 898 tensor([21., 0., 88., 46.], device='cuda:0')
124 898 tensor([45., 51., 89., 98.], device='cuda:0')
124 898 tensor([ 50., 132., 76., 183.], device='cuda:0')
124 898 tensor([ 63., 227., 63., 349.], device='cuda:0')
124 898 tensor([ 63., 795., 63., 870.], device='cuda:0')
124 898 tensor([ 0., 229., 50., 261.], device='cuda:0')
124 898 tensor([ 11., 269., 49., 308.], device='cuda:0')
124 898 tensor([ 16., 314., 50., 349.], device='cuda:0')
124 898 tensor([ 72., 259., 122., 280.], device='cuda:0')
124 898 tensor([ 72., 290., 123., 324.], device='cuda:0')
124 898 tensor([ 43., 389., 89., 419.], device='cuda:0')
124 898 tensor([ 43., 426., 89., 470.], device='cuda:0')
124 898 tensor([ 55., 531., 71., 584.], device='cuda:0')
124 898 tensor([ 45., 619., 89., 666.], device='cuda:0')
124 898 tensor([ 50., 700., 76., 751.], device='cuda:0')
124 898 tensor([ 0., 815., 50., 847.], device='cuda:0')
124 898 tensor([ 72., 798., 123., 827.], device='cuda:0')
124 898 tensor([ 72., 835., 123., 867.], device='cuda:0')
124 898 tensor([ 78., 887., 89., 897.], device='cuda:0')
Still, I get the following error:
/opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
File "/anaconda2/envs/py/lib/python3.7/site-packages/torchvision/models/detection/roi_heads.py", line 34, in fastrcnn_loss
sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
RuntimeError: copy_if failed to synchronize: device-side assert triggered
Am I missing something? If it is not the actual target values that are out of range, then what else could cause this error?
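A sanity check along the following lines might help narrow this down. It is only a sketch with hypothetical helper names (`box_problems`, `out_of_range_labels`), and it works on plain Python lists (for example the result of `.tolist()` on the boxes and labels tensors), not on anything torchvision provides. It assumes the printed values really are in the documented [x0, y0, x1, y1] order, so x is compared against W and y against H. The label check is included because the assertion `t >= 0 && t < n_classes` in the log is the classification loss checking class indices, which suggests the labels are worth inspecting as well as the boxes.

```python
# Hypothetical helpers for illustration; not part of torchvision.
# Inputs are plain Python lists, e.g. boxes.tolist() and labels.tolist().

def box_problems(box, height, width):
    """Return a list of problems found in one [x0, y0, x1, y1] box."""
    x0, y0, x1, y1 = box
    problems = []
    # x coordinates are bounded by the image width W.
    if not (0 <= x0 <= width and 0 <= x1 <= width):
        problems.append("x outside [0, W]")
    # y coordinates are bounded by the image height H.
    if not (0 <= y0 <= height and 0 <= y1 <= height):
        problems.append("y outside [0, H]")
    # A box needs strictly positive width and height.
    if x1 <= x0 or y1 <= y0:
        problems.append("zero or negative width/height")
    return problems


def out_of_range_labels(labels, num_classes):
    """Return the labels that fall outside [0, num_classes)."""
    return [int(t) for t in labels if not (0 <= int(t) < num_classes)]


if __name__ == "__main__":
    H, W = 124, 898
    # Two of the sample boxes from the post.
    for box in ([21., 0., 88., 46.], [63., 795., 63., 870.]):
        print(box, box_problems(box, H, W))
    # Assumed example values: with num_classes=3, valid labels are 0, 1, 2.
    print(out_of_range_labels([1, 2, 3], num_classes=3))
```

With H = 124 and W = 898, the first sample box passes, while [63, 795, 63, 870] is flagged both for a y value above H and for zero width, which may mean the coordinates are actually stored in [y0, x0, y1, x1] order. Running one batch on the CPU usually gives a clearer Python-level error than the device-side assert.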