Keypoint R-CNN training error

I get this strange error when I run my code for training a Keypoint RCNN FPN with a ResNet50 backbone:

Exception has occurred: IndexError
index 3 is out of bounds for dimension 0 with size 1
  File "keypointrcnn.py", line 89, in <module>
    outputs = model(images, targets)
IndexError: index 3 is out of bounds for dimension 0 with size 1

Some context: I have a dataset where every image shows one individual, annotated with exactly 5 bounding boxes and 8 keypoints. The target shapes are: targets['boxes'] is [5, 4], targets['labels'] is [5], and targets['keypoints'] is [1, 8, 3].

Here is the code for fetching the model:

import torch
import torchvision

def getkeypointmodel(num_classes, num_keypoints):
    # Keypoint R-CNN with a ResNet50-FPN backbone; only the backbone is pretrained
    model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False,
                                                                   pretrained_backbone=True,
                                                                   num_keypoints=num_keypoints,
                                                                   num_classes=num_classes)
    return model

Below is the training loop for my model:

# Train the model
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
num_epochs = 1
for epoch in range(num_epochs):
    for images, targets in data_loader_training:
        optimizer.zero_grad()
        images = [image.to(device) for image in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        img = images[0].shape
        tb = targets[0]['boxes'].shape
        tl = targets[0]['labels'].shape
        tk = targets[0]['keypoints'].shape

        # Forward pass: in training mode the model returns a dict of losses
        outputs = model(images, targets)
        losses = sum(loss for loss in outputs.values())

        # Backward pass and optimization
        losses.backward()
        optimizer.step()

    # Update the learning rate
    lr_scheduler.step()

I hope someone can explain this strange error message to me.

Thanks in advance!

Actually, torchvision.models.detection.keypointrcnn_resnet50_fpn expects its input to be a list of image tensors, for example 2 tensors of shape (3, 300, 400), while you seem to be passing it two lists of some other shape. I think that is what causes the error: the list named images, i.e. the 0th dimension of the model input, has nothing at index 3.

So please modify the shape of the input you are giving to the model.

I hope this helps.

Hmm, I think you are talking about passing it a batch of image tensors. I am doing that, but during training the model also needs to be passed a list of targets. So unfortunately I don't think merging the image tensors with the targets is the correct thing to do.
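
One thing that may be worth checking is the targets themselves. The torchvision documentation describes the keypoints target as a FloatTensor[N, K, 3] with one row per instance, so N would normally match the N in boxes ([N, 4]) and labels ([N]). In the shapes quoted in the question, boxes has N = 5 while keypoints has N = 1, which could plausibly explain an index running past a dimension of size 1. Below is a minimal sketch of such a shape check in plain Python. The helper name check_keypoint_target_shapes is hypothetical (it is not part of torchvision), and it only compares shapes, not the tensor contents.

```python
def check_keypoint_target_shapes(boxes_shape, labels_shape, keypoints_shape, num_keypoints):
    """Hypothetical helper: return a list of problems; an empty list means the shapes agree.

    Assumes the documented target layout: boxes [N, 4], labels [N], keypoints [N, K, 3].
    """
    problems = []
    boxes_shape = tuple(boxes_shape)
    labels_shape = tuple(labels_shape)
    keypoints_shape = tuple(keypoints_shape)

    # N is taken from the boxes, since every other field is expected to follow it.
    num_instances = boxes_shape[0] if boxes_shape else 0

    if len(boxes_shape) != 2 or boxes_shape[1] != 4:
        problems.append(f"boxes should be [N, 4], got {list(boxes_shape)}")

    if labels_shape != (num_instances,):
        problems.append(f"labels should be [{num_instances}], got {list(labels_shape)}")

    expected_keypoints = (num_instances, num_keypoints, 3)
    if keypoints_shape != expected_keypoints:
        problems.append(
            f"keypoints should be {list(expected_keypoints)}, got {list(keypoints_shape)}"
        )

    return problems


# Shapes reported in the question: 5 boxes, 5 labels, but only 1 keypoint instance.
for problem in check_keypoint_target_shapes((5, 4), (5,), (1, 8, 3), num_keypoints=8):
    print(problem)
```

In the training loop, the same check could be run on targets[0]['boxes'].shape and the other shapes before the forward pass. If the 8 keypoints really belong to only one of the 5 boxes, how to fill the remaining rows is a dataset design decision that this sketch does not address.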