Hi! I have a big issue training a Keypoint R-CNN for keypoint detection.
My dataset consists of pictures of faces with a bounding box around each eye and one keypoint per box, the center of the eye.
The model definition is the one from torchvision:
model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False, progress=True, num_classes=2, num_keypoints=1, pretrained_backbone=True)
I am training the network using the COCO training approach:
# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.0001,
                            momentum=0.9, weight_decay=0.0005)
# and a learning rate scheduler which decreases the learning rate by
# 10x every 3 epochs
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                               step_size=3,
                                               gamma=0.1)
num_epochs = 7
for epoch in range(num_epochs):
    # train for one epoch, printing every 30 iterations
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=30)
    lr_scheduler.step()
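To make the schedule concrete, here is a minimal sketch of the learning rate each epoch should see with these settings. It assumes the usual StepLR rule (base rate multiplied by gamma once every step_size scheduler steps) and does not call PyTorch; the helper name step_lr is made up for illustration.

```python
def step_lr(base_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate used during a 0-indexed epoch under a step decay rule.

    Hypothetical helper: mirrors base_lr * gamma ** (epoch // step_size),
    with the scheduler stepped once at the end of every epoch.
    """
    return base_lr * gamma ** (epoch // step_size)


# Same values as in the post: lr=0.0001, step_size=3, gamma=0.1, 7 epochs.
schedule = [step_lr(0.0001, epoch) for epoch in range(7)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.1e}")
```

So epochs 0 to 2 run at the base rate, epochs 3 to 5 at one tenth of it, and the last epoch at one hundredth.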
This is the issue I get every time: I can’t even finish one epoch of training.
I have already checked that my data is passed correctly, and the data loader is implemented as described in the torchvision documentation.
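For reference, a minimal sketch of the kind of sanity check meant here. It assumes the target layout from the torchvision documentation: boxes of shape [N, 4] as [x1, y1, x2, y2] with x1 < x2 and y1 < y2, labels of shape [N], and keypoints of shape [N, K, 3] as [x, y, visibility], with K = 1 in this case. The function name find_target_problems is hypothetical, and it takes plain nested lists (for example from target["boxes"].tolist()) so it runs without PyTorch. It is only a format check, not a diagnosis of the exception.

```python
def find_target_problems(boxes, labels, keypoints, num_keypoints=1):
    """Return a list of problems found in one training target.

    Hypothetical helper; inputs are nested lists, e.g. tensor.tolist().
    """
    problems = []

    # One label and one keypoint set are expected per box.
    if not (len(boxes) == len(labels) == len(keypoints)):
        problems.append(
            f"length mismatch: {len(boxes)} boxes, "
            f"{len(labels)} labels, {len(keypoints)} keypoint sets"
        )

    # Boxes must be [x1, y1, x2, y2] with positive width and height.
    for i, box in enumerate(boxes):
        if len(box) != 4:
            problems.append(f"box {i} has {len(box)} values, expected 4")
            continue
        x1, y1, x2, y2 = box
        if not (x1 < x2 and y1 < y2):
            problems.append(f"box {i} is degenerate: {box}")

    # Keypoints must be K entries of [x, y, visibility] per box.
    for i, kps in enumerate(keypoints):
        if len(kps) != num_keypoints:
            problems.append(
                f"instance {i} has {len(kps)} keypoints, expected {num_keypoints}"
            )
            continue
        for j, kp in enumerate(kps):
            if len(kp) != 3:
                problems.append(
                    f"keypoint {j} of instance {i} has {len(kp)} values, expected 3"
                )

    return problems


# One eye box with its center keypoint marked visible.
good = find_target_problems(
    boxes=[[10.0, 20.0, 30.0, 35.0]],
    labels=[1],
    keypoints=[[[20.0, 27.5, 1.0]]],
)

# Zero-width box and a keypoint without the visibility flag.
bad = find_target_problems(
    boxes=[[10.0, 20.0, 10.0, 35.0]],
    labels=[1],
    keypoints=[[[20.0, 27.5]]],
)
print(good)
print(bad)
```

Running this over every target the data loader yields would at least rule out zero-area boxes and a missing visibility value.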
The following image shows the exception:
Has anyone already encountered this problem? Thank you very much for your help.