How to get losses and predictions at the same time?

Hi all,

I’m currently working on a project based on the finetuning tutorial. My model object is set up the same way as in the tutorial.

Now, for validation purposes I need two things: first, the validation losses, and second, the predictions, so that I can compute the Mean Average Precision (mAP).

According to the docs:
"During training, the model expects both the input tensors, as well as a targets (list of dictionary), containing boxes, labels, masks. The model returns a Dict[Tensor] during training, containing the classification and regression losses for both the RPN and the R-CNN, and the mask loss.

During inference, the model requires only the input tensors, and returns the post-processed
predictions as a List[Dict[Tensor]], one for each input image. The fields of the Dict are as
follows: boxes, labels, scores, masks."

So, to get the losses we do:

losses = model(images, targets)

And to get the predictions we do:

predictions = model(images)

I need both: the losses to measure the validation error and the predictions to measure mAP. I’d want something like:

losses, predictions = model(images, targets)

Does anyone know if it is possible with this implementation? I could get both things by just doing those two lines of code, but that means iterating through the dataset twice, something that can take a long time.


Based on this code snippet from the tutorial:

# For training
images, targets = next(iter(data_loader))
images = list(image for image in images)
targets = [{k: v for k, v in t.items()} for t in targets]
output = model(images, targets)  # Returns losses and detections
# For inference
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
predictions = model(x)  # Returns predictions

it seems the model returns losses and detections during training (i.e. if the model is in .train() mode), so wouldn’t this work for your use case?

Unfortunately, I don’t think that works. Even though that comment says it returns losses and detections, it only returns losses when you feed it both images and targets.

I added a couple of prints right after the code snippet you mentioned:

print("Output:", output)
print("Predictions:", predictions)

and it yields:

Output: {'loss_classifier': tensor(0.1091, grad_fn=<NllLossBackward0>), 'loss_box_reg': tensor(0.0480, grad_fn=<DivBackward0>), 'loss_objectness': tensor(0.0275, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>), 'loss_rpn_box_reg': tensor(0.0034, grad_fn=<DivBackward0>)}

Predictions: [{'boxes': tensor([], size=(0, 4), grad_fn=<StackBackward0>), 'labels': tensor([], dtype=torch.int64), 'scores': tensor([], grad_fn=<IndexBackward0>)}, {'boxes': tensor([], size=(0, 4), grad_fn=<StackBackward0>), 'labels': tensor([], dtype=torch.int64), 'scores': tensor([], grad_fn=<IndexBackward0>)}]

So, feeding the model only images gives the predictions, while feeding it images and targets gives the losses.
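
One way to avoid a second pass over the dataset is to run both forward calls on each batch inside the same loop. The sketch below is only a minimal illustration, not something taken from the tutorial: `evaluate_batch` is a hypothetical helper name, and it assumes the model behaves as in the printed outputs (a loss dict in train mode with targets, a list of prediction dicts in eval mode). Be aware that switching to train mode can change the behavior of layers such as dropout or non-frozen batch norm, so the losses may not exactly match what a pure eval-mode computation would give.

```python
import torch


@torch.no_grad()  # validation only: no gradients are needed for either call
def evaluate_batch(model, images, targets):
    """Return (loss_dict, predictions) for one batch using two forward calls."""
    was_training = model.training  # remember the caller's mode
    try:
        # Train mode + targets: the model is assumed to return a dict of losses.
        model.train()
        loss_dict = model(images, targets)
        # Eval mode, images only: the model is assumed to return predictions.
        model.eval()
        predictions = model(images)
    finally:
        model.train(was_training)  # restore the original mode
    return loss_dict, predictions
```

Inside a validation loop you would call it once per batch, e.g. `loss_dict, predictions = evaluate_batch(model, images, targets)`, and sum `loss_dict.values()` to get the total validation loss. It still costs two forward passes per batch, but each batch is loaded only once.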