Hi. I’m trying YOLOv3 transfer learning with the NWPU dataset.
I only want to detect people, so my nwpu.names file contains a single class:
person
During training, I got a runtime error (RuntimeError: CUDA error: device-side assert triggered).
The full error output is below.
Training Epoch 0: 76%|███████▌ | 87/115 [04:48<00:36, 1.31s/it]/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [4,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [26,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [0,0,0], thread: [28,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Training Epoch 0: 76%|███████▌ | 87/115 [05:07<01:38, 3.53s/it]
Traceback (most recent call last):
File "trainer.py", line 235, in <module>
trainer(opt.data_config, opt.multiscale_training, opt.img_size, opt.batch_size, opt.n_cpu, opt.model_def, opt.pretrained_weights,
File "trainer.py", line 191, in trainer
train_result = train(model, optimizer, train_dataloader, epoch, device, gradient_accumulations)
File "trainer.py", line 96, in train
loss, outputs = model(imgs, targets)
File "~~~/.pyenv/versions/yoloenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "~~~/models.py", line 266, in forward
x, layer_loss = module[0](x, targets, img_dim)
File "~~~/.pyenv/versions/yoloenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "~~~/vv-yolo/yolov3/models.py", line 191, in forward
iou_scores, class_mask, obj_mask, noobj_mask, tx, ty, tw, th, tcls, tconf = build_targets(
File "~~~/vv-yolo/yolov3/utils/utils.py", line 306, in build_targets
noobj_mask[b[i], anchor_ious > ignore_thres, gj[i], gi[i]] = 0
RuntimeError: CUDA error: device-side assert triggered
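Because CUDA kernels launch asynchronously, the Python line shown in the traceback is not necessarily where the bad index originated. A minimal sketch of forcing synchronous launches for a rerun, assuming it is placed at the very top of trainer.py before torch is imported (the placement is an assumption, not something taken from the repository):

```python
import os

# Must be set before anything initializes CUDA (safest: before importing torch).
# With synchronous launches, the traceback points at the operation that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

Running a single batch on the CPU is another option, since the same out-of-range index then usually surfaces as a regular Python IndexError.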
- Some people say this is an input/output dimension mismatch, but I don’t think that is the case here.
- I already adjusted the filters to match the number of classes (one class → the filters right before each yolo layer are set to 18).
- Someone else hit this error with a custom loss and solved it by adjusting the batch size. That doesn’t apply to me.
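The filter count in the second point follows the usual YOLOv3 rule, anchors per scale × (5 + number of classes). Below is a sketch of that arithmetic together with a label sanity check that may be relevant to this kind of assertion. It assumes darknet-style label lines (`class x_center y_center width height`, coordinates normalized to the image size) and that build_targets turns box centers into grid indices by scaling them by the grid size and truncating, as common PyTorch YOLOv3 implementations do. The function names are hypothetical and not part of the repository.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters in the conv layer right before each YOLO layer."""
    # Each anchor predicts x, y, w, h, objectness, plus one score per class.
    return anchors_per_scale * (5 + num_classes)


def find_bad_labels(lines, num_classes):
    """Return (line_number, reason) pairs for label lines that look invalid."""
    problems = []
    for n, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            problems.append((n, "expected 5 fields"))
            continue
        try:
            cls = int(float(parts[0]))
            x, y, w, h = (float(v) for v in parts[1:])
        except (ValueError, OverflowError):
            problems.append((n, "non-numeric field"))
            continue
        if not 0 <= cls < num_classes:
            problems.append((n, "class index out of range"))
        # A center below 0 or at/above 1 lands outside the feature map
        # once it is scaled by the grid size and truncated to an integer.
        if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
            problems.append((n, "box center outside [0, 1)"))
        if not (0.0 < w <= 1.0 and 0.0 < h <= 1.0):
            problems.append((n, "box size outside (0, 1]"))
    return problems


if __name__ == "__main__":
    print(yolo_filters(1))  # one class
    # Made-up label lines for illustration only, not from the NWPU dataset.
    sample = ["0 0.5 0.5 0.2 0.3", "1 0.5 0.5 0.2 0.3", "0 1.02 0.4 0.1 0.1"]
    for problem in find_bad_labels(sample, num_classes=1):
        print(problem)
```

If a check like this flags any line in the real label files, a class index other than 0 or a box center outside the image would be the first thing to rule out for the failing indexing in build_targets.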
Does anyone have a solution? Thank you.