Compute validation loss for Faster RCNN

Hi, I’m doing object detection on a custom dataset using transfer learning from a pretrained Faster RCNN model.
I would like to compute validation loss at the end of each epoch. How can this be done?

If I run the code below (model in training mode) I get losses, but dropout isn’t deactivated, so I am wondering how ‘valid’ these loss values are. And running the model in eval mode only returns the predictions.

model.train()  # keep train mode so the model returns the loss dict
for images, targets in data_loader_val:
    images = [image.to(device) for image in images]
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

    with torch.no_grad():  # no gradients needed for validation
        val_loss_dict = model(images, targets)
        print(val_loss_dict)
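(For completeness: the dict returned in train mode contains the individual loss components — loss_classifier, loss_box_reg, loss_objectness and loss_rpn_box_reg — so if a single scalar is needed it can simply be summed, something like:)

val_loss = sum(loss for loss in val_loss_dict.values()).item()  # one scalar for logging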

I’m wondering the same thing. Did you find a solution? I was thinking of forcing the training mode on only some submodules (the ones that output losses).

I thought it through and came to the conclusion that validation loss is only meaningful relative to training loss. Training loss is computed with dropout active too, so the two are comparable.

I guess dropout might be OK, but in general wouldn’t that mess up modules like batch norm that keep running estimates?
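One option I was considering (just a sketch of the idea, not tested on every torchvision version): keep the model in train mode so the loss dict is still returned, but switch the Dropout and BatchNorm submodules to eval so dropout is off and the running estimates are not updated.

import torch.nn as nn

def train_mode_without_dropout_and_bn_updates(model):
    # hypothetical helper: overall train mode (so Faster RCNN returns losses),
    # but Dropout/BatchNorm layers in eval mode (no dropout, no running-stat updates)
    model.train()
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()
    return model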

Hello,

Did you reach any conclusion? I am also working on object detection with my custom dataset and would like to track how the validation and training losses evolve, but I’m not sure whether it is good practice to use .train() mode during evaluation.

@mapostig No, I guess it’s not good practice to use model.train() mode during evaluation. You can use the same custom dataset class to create a separate data loader for your evaluation dataset.

for phase in ['train', 'val']:
    if phase == 'train':
        model.train()
        # training part with backprop
    else:
        model.eval()
        # just a forward pass

Some layers, like Dropout and BatchNorm, behave differently under model.eval().
For more details, you can also look at this discussion.
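Note that in the 'val' phase the torchvision detection models don’t return losses at all; with model.eval() a forward pass gives you the predictions instead, roughly like this (a sketch, assuming the same data_loader_val and device):

model.eval()
with torch.no_grad():
    for images, _ in data_loader_val:
        images = [img.to(device) for img in images]
        outputs = model(images)  # list of dicts with 'boxes', 'labels' and 'scores'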


I am stuck at the same place as well. Is it possible to calculate validation loss properly?

@loicdtx, did you find a solution to this problem?

@Arun_Mohan, validation loss is just there to control for overfitting during training; it has no analytical value in itself. It’s therefore completely fine to compute it like I did in the original post (model in train mode and gradients deactivated).
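To make that concrete, here is a sketch of a per-epoch validation loss (same assumptions as the original post: model, data_loader_val and device already exist):

import torch

model.train()  # train mode so the loss dict is returned
val_loss = 0.0
with torch.no_grad():  # but no gradients and no weight updates
    for images, targets in data_loader_val:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)
        val_loss += sum(loss for loss in loss_dict.values()).item()
val_loss /= len(data_loader_val)
print(f"validation loss: {val_loss:.4f}")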


@loicdtx thanks… I tried it the same way as well. I think it is not an issue.

While I agree with those above arguing that computing the validation loss in train mode is fine, there is still a serious efficiency problem here.

If you also want the model outputs (for tracking IoU, accuracy, etc., which is often the case), then you need to run inference twice: training mode for the loss, and eval mode for the outputs. It would be better to get both in a single pass!
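For reference, the two-pass version I mean looks roughly like this (a sketch only, same assumptions as earlier posts):

with torch.no_grad():
    for images, targets in data_loader_val:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        model.train()              # pass 1: loss dict
        loss_dict = model(images, targets)

        model.eval()               # pass 2: detections for IoU / mAP
        detections = model(images)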


I posted a hacky solution on StackOverflow to get both outputs and losses in a single pass: https://stackoverflow.com/questions/60339336/validation-loss-for-pytorch-faster-rcnn/65347721#65347721

Hi @Usama_Hasan,

Thanks for your answer. I am using the pretrained Faster RCNN model, and I see that the batch normalization layers are frozen:

FasterRCNN(
  (transform): GeneralizedRCNNTransform()
  (backbone): BackboneWithFPN(
    (body): IntermediateLayerGetter(
      (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
      (bn1): FrozenBatchNorm2d()
      (relu): ReLU(inplace=True)
      (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
      (layer1): Sequential(
        (0): Bottleneck(
          (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn1): FrozenBatchNorm2d()
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): FrozenBatchNorm2d()
          (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn3): FrozenBatchNorm2d()
          (relu): ReLU(inplace=True)
          (downsample): Sequential(
            (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (1): FrozenBatchNorm2d()
          )
        )

So it should not affect the running stats, right?
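(This is how I checked it, in case it helps — a small sketch assuming model is the torchvision Faster RCNN:)

import torch.nn as nn
from torchvision.ops.misc import FrozenBatchNorm2d

# FrozenBatchNorm2d has fixed statistics, so train() cannot change them
frozen = sum(isinstance(m, FrozenBatchNorm2d) for m in model.modules())
regular = sum(isinstance(m, nn.BatchNorm2d) for m in model.modules())
print(f"FrozenBatchNorm2d: {frozen}, BatchNorm2d: {regular}")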

Hey @coolcucumber94, you’re right, it won’t affect the running stats.
Also, you can put your code inside a code section or just wrap it in triple backticks; it helps with debugging and understanding the code.

Thanks for the reply.

Was this found to be appropriate? I’ve been told that batch norm and dropout layers need to be in eval() mode; however, I’m only interested in calculating the validation loss to save the “best” model.
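What I had in mind for the “best” model part is something like the sketch below (compute_validation_loss is a placeholder for whatever loop you use, e.g. the train-mode no_grad loop from earlier in this thread; the checkpoint path is arbitrary):

best_val_loss = float("inf")

for epoch in range(num_epochs):
    # ... train for one epoch ...
    val_loss = compute_validation_loss(model, data_loader_val, device)  # placeholder helper
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_model.pth")  # keep the best checkpoint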
