I am using torchvision's Faster R-CNN for object detection on the MS COCO dataset, with the instances_train2017.json annotation file.
My code for setting up the dataloader and the model is:
# selected class ids: extract class id from the annotation
coco_data_args = {'datalist':im_ids, 'coco_interface':coco_interface, 'coco_classes_idx':selected_class_ids,'stage':'train', 'adjusted_classes_idx':adjusted_class_ids}
coco_data = COCOData(**coco_data_args)
coco_dataloader_args = {'batch_size':Hyper.batch_size, 'shuffle':True}
coco_dataloader = data.DataLoader(coco_data, **coco_dataloader_args)
print(f"Size of the dataloader = {len(coco_dataloader)}")
step = 0
# initialize model, loss, etc.
fasterrcnn_args = {'num_classes':81, 'min_size':512, 'max_size':800}
fasterrcnn_model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False,**fasterrcnn_args)
The code where I run the Faster R-CNN model is:
for _, b in enumerate(coco_dataloader):
    i += 1
    if i % 100 == 0:
        print(f"step {i}")
    fasterrcnn_optimizer.zero_grad()
    X,y = b
    if Constants.device==T.device('cuda'):
        X = X.to(Constants.device)
        y['labels'] = y['labels'].to(Constants.device)
        y['boxes'] = y['boxes'].to(Constants.device)
    images = [im for im in X]
    targets = []
    lab={}
    # THIS IS IMPORTANT!!!!!
    # get rid of the first dimension (batch)
    # IF you have >1 images, make another loop
    # REPEAT: DO NOT USE BATCH DIMENSION
    # Pytorch is sensitive to formats. Labels must be int64, bboxes float32, masks uint8
    lab['boxes'] = y['boxes'].squeeze_(0)
    lab['labels'] = y['labels'].squeeze_(0)
    targets.append(lab)
    # avoid empty objects
    if len(targets)>0:
        loss = fasterrcnn_model(images, targets)
        total_loss = 0
        for k in loss.keys():
            total_loss += loss[k]
        epoch_loss += total_loss.item()
        total_loss.backward()
        fasterrcnn_optimizer.step()
It fails on this line of the training loop: loss = fasterrcnn_model(images, targets)
The error I get is:
Traceback (most recent call last):
  File "U:/705/cwk/mscoco/src/main.py", line 28, in <module>
    main()
  File "U:/705/cwk/mscoco/src/main.py", line 17, in main
    train()
  File "U:\705\cwk\mscoco\src\train.py", line 103, in train
    loss = fasterrcnn_model(images, targets)
  File "C:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Anaconda3\lib\site-packages\torchvision\models\detection\generalized_rcnn.py", line 92, in forward
    raise ValueError("All bounding boxes should have positive height and width."
ValueError: All bounding boxes should have positive height and width. Found invalid box [11.898031234741211, 225.0413055419922, 18.502750396728516, 225.0413055419922] for target at index 0.
Process finished with exit code 1
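For reference, torchvision's detection models expect each box as [x1, y1, x2, y2]. The box in the message has y1 equal to y2 (both 225.0413...), so its height is zero. A minimal sketch of that kind of check in plain Python, using a hypothetical helper name `has_positive_area` that is not part of torchvision:

```python
def has_positive_area(box):
    """Return True if an [x1, y1, x2, y2] box has positive width and height."""
    x1, y1, x2, y2 = box
    return x2 > x1 and y2 > y1


# The box reported in the traceback: y1 == y2, so the height is zero.
reported = [11.898031234741211, 225.0413055419922,
            18.502750396728516, 225.0413055419922]

print(has_positive_area(reported))  # prints False
```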
I am really surprised this is happening, as I did not expect data problems in MS COCO. Is there a problem in my code? How do I fix this?
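One possible direction, sketched under assumptions: COCO annotations store `bbox` as [x, y, width, height], so the dataset class has to convert them to [x1, y1, x2, y2] somewhere, and a box with zero (or near-zero) width or height would need to be dropped before it reaches the model. The helper names `xywh_to_xyxy` and `keep_valid` and the `min_size` threshold below are hypothetical. How `COCOData` actually builds its targets is not shown in the post, so this is only an illustration of the filtering step, not a drop-in fix.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO-style [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def keep_valid(annotations, min_size=1.0):
    """Return (boxes, labels), skipping boxes narrower or shorter than min_size.

    `annotations` is a list of COCO annotation dicts with "bbox" and
    "category_id" keys. `min_size` (in pixels) is an assumed threshold.
    """
    boxes, labels = [], []
    for ann in annotations:
        x1, y1, x2, y2 = xywh_to_xyxy(ann["bbox"])
        if (x2 - x1) >= min_size and (y2 - y1) >= min_size:
            boxes.append([x1, y1, x2, y2])
            labels.append(ann["category_id"])
    return boxes, labels


# Made-up annotations for illustration: the second one has zero height.
example_annotations = [
    {"bbox": [10.0, 20.0, 30.0, 40.0], "category_id": 1},
    {"bbox": [5.0, 5.0, 6.0, 0.0], "category_id": 2},
]
boxes, labels = keep_valid(example_annotations)
print(boxes, labels)
```

If every box of an image is filtered out, that image would probably have to be skipped as well. Note that the `len(targets)>0` check in the training loop always passes, because one dict is always appended to `targets`. A check on the number of remaining boxes may be closer to what was intended.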