Hello. I am trying to train and evaluate a Mask R-CNN model based on the PyTorch Torchvision Object Detection Finetuning Tutorial. I’ve modified the code to fit my own dataset. Training runs fine, but I want to evaluate the model after each epoch so that I can save it whenever its score improves. The problem is that the scores in the output don’t change at all, and sometimes the model even returns empty scores.
Here is the core of the training code and the part that I added to evaluate the model.
bestScore = 0
for epoch in range(int(num_epochs)):
    model.train()
    # train for one epoch, printing every 10 iterations
    train_one_epoch(model, optimizer, data_loader, device, epoch, writer, print_freq=10)
    # update the learning rate
    lr_scheduler.step()
    #####################################
    # evaluate and save the model
    model.eval()
    soma = 0
    qtd = 0
    # no_grad avoids building the autograd graph during evaluation
    with torch.no_grad():
        for image in data_loader_test:
            # Here we create a list, because the model expects a list of Tensors
            lista = []
            # Send the image to the same device as the model (only the first image of each batch is used)
            x = image[0][0].to(device)
            lista.append(x)
            output = model(lista)
            for item in output[0]['scores']:
                soma += item.item()
                qtd += 1
    # The model can return no detections at all, so avoid dividing by zero
    averageScores = soma / qtd if qtd > 0 else 0.0
    print("Epoch ", epoch, " score: ", averageScores)
    if averageScores > bestScore:
        print(averageScores, "is bigger than ", bestScore)
        bestScore = averageScores
        print("#### Saving the model ####")
        path = "model.pth"
        torch.save(model.state_dict(), path)
    #####################################
And this is a sample output of the model in “eval mode” that I often get:
[{'boxes': tensor([], device='cuda:0', size=(0, 4), grad_fn=<StackBackward>), 'labels': tensor([], device='cuda:0', dtype=torch.int64), 'scores': tensor([], device='cuda:0', grad_fn=<IndexBackward>), 'masks': tensor([], device='cuda:0', size=(0, 1, 480, 720))}]
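To make the empty case concrete, here is a minimal sketch of how the average could be computed from a list of outputs shaped like the sample above without failing when `scores` is empty. The helper name `mean_detection_score` is hypothetical (it is not part of torchvision), and plain Python lists stand in for the score tensors so the snippet runs on its own.

```python
def mean_detection_score(outputs):
    """Average confidence over all detections in a list of per-image outputs.

    Each element of `outputs` is assumed to be a dict with a 'scores' entry,
    like the dicts the model returns in eval mode.
    Returns None when there are no detections at all.
    """
    total = 0.0
    count = 0
    for out in outputs:
        scores = out["scores"]
        # Tensors expose .tolist(); plain Python lists are used as-is.
        if hasattr(scores, "tolist"):
            scores = scores.tolist()
        for s in scores:
            total += float(s)
            count += 1
    return total / count if count > 0 else None


# Plain lists stand in for the model's score tensors (illustrative values).
print(mean_detection_score([{"scores": []}]))  # no detections -> None
print(mean_detection_score([{"scores": [0.9, 0.7]}, {"scores": [0.5]}]))
```

Returning `None` instead of 0 keeps "no detections" distinguishable from "very low confidence" when comparing epochs. This only handles the bookkeeping; it does not settle whether the average confidence is a meaningful metric.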
I don’t know whether averaging the scores like this is correct, or whether it is a valid and proper way to evaluate this Mask R-CNN model during training. Could you guys help me out with this one? Thanks in advance!