Same model, data, and iteration, totally different forward output in train vs eval mode, why?

Hello, as you can see, for the same data in the same iteration, when putting the model in train and then eval mode, the forward pass gives totally different outputs:

TRAIN output: [[  7.2314,  -9.7634], [-11.1361,  15.7153], [ -6.0605,   7.8842], [ -7.5571,  10.5594]]
EVAL  output: [[  9.7397, -12.8771], [  2.3994,  -3.3750], [  5.1334,  -6.9638], [  9.0693, -11.9809]]


Would you please tell me the possible reasons?

Hi @AlexLuya-san,

It is probably caused by the train mode and eval mode being in the same for loop.
After one training step the parameters of your model are updated, so a different result shows up in eval after the train step.
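For illustration, here is a minimal standalone sketch (a toy nn.Linear model, not your actual backbone) of how a single parameter update already changes the next forward output:

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)                        # toy model, stands in for the real one
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(3, 4)

out_before = model(x)                          # forward pass before the update
loss = out_before.sum()
loss.backward()
optimizer.step()                               # parameters change here
out_after = model(x)                           # same data, different numbers now
print(torch.allclose(out_before, out_after))   # False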

Hello,

If your model has BatchNorm or Dropout modules, this behavior is normal. They don't behave the same way in training and eval mode, which is why you get different results.
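For example, a minimal sketch with toy layers (not your actual model) shows the mode-dependent behavior:

import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.Dropout(p=0.5))
x = torch.randn(4, 8)

net.train()
with torch.no_grad():
    out_train = net(x)   # Dropout zeroes activations, BatchNorm uses batch statistics

net.eval()
with torch.no_grad():
    out_eval = net(x)    # Dropout is a pass-through, BatchNorm uses running statistics

print(torch.allclose(out_train, out_eval))   # almost certainly False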

Thanks,
after changing the code to:

for batch, (images, labels) in dataIter:
    startTime = time.time()
    import torch
    with torch.no_grad():
        outputs = self.forward(images)
    # loss = self.criterion(outputs, labels.cuda())
    # self.backward(loss)
    self.backbone.eval()
    outputs_eval = self.forward(images)
    print("hello")

outputs is still totally different from outputs_eval

My suggestion is to use separate loops for train and eval. After finishing the for loop for the train you obtain the trained parameters, so you can verify your parameters and your model too.

Thanks, as you can see, from the second item onward the classification results are actually opposite to each other in the two modes

TRAIN output: [[  7.2314,  -9.7634], [-11.1361,  15.7153], [ -6.0605,   7.8842], [ -7.5571,  10.5594]]
EVAL  output: [[  9.7397, -12.8771], [  2.3994,  -3.3750], [  5.1334,  -6.9638], [  9.0693, -11.9809]]

and how can I use this evaluation result to evaluate the model?

As @111137 suggested, I think that you should separate the two for loops. That could be something like:

for e in range(nb_epochs):
    model.train()
    for batch, (images, labels) in dataIterTrain:
        optimizer.zero_grad()
        ...  # do your stuff: forward pass, loss, backward, optimizer.step()
    model.eval()
    with torch.no_grad():
        for batch, (images, labels) in dataIterVal:
            ...  # validate your model

And don’t forget that some modules behave differently depending on the mode of the model. So even at the end of the training loop, if you do something like:

model.train()
output_train = model(input)
model.eval()
output_eval = model(input)

The two outputs will be different.

Let us consider training and evaluation.

  • Training is to update parameters by back-propagation.
  • Evaluation is to verify the result of training.

So these two modes should not be in the same for loop.
The sequence of your code means:

  1. train the model, generate results, and update parameters.
  2. eval the model using the updated parameters, and generate results.
  3. repeat this sequence in the for loop (going back to 1)

So a different number is always there, I think.
NOTE: training and validation should not use the same data set, in order to check for overfitting of your model.
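For example, one common way to keep them separate (here full_dataset is just a placeholder for whatever Dataset you use):

from torch.utils.data import DataLoader, random_split

# full_dataset is a placeholder for whatever Dataset you are using
n_val = int(0.2 * len(full_dataset))
train_set, val_set = random_split(full_dataset, [len(full_dataset) - n_val, n_val])

dataIterTrain = DataLoader(train_set, batch_size=32, shuffle=True)
dataIterVal = DataLoader(val_set, batch_size=32, shuffle=False)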

Thanks, I actually do training and evaluation in different loops, and the training accuracy rises from 0.5 to 0.995 after several epochs, but the evaluation accuracy stays at 0.5 (a little bit higher or lower). Because of this weirdness, I put some evaluation data in the same loop to check the output in the different modes and found this, so I want to know why.

The reason is this: you do it in your code.

It means your program is working correctly.

In this code:

for batch, (images, labels) in dataIter:
    startTime = time.time()
    import torch
    with torch.no_grad():
        outputs = self.forward(images)
    # loss = self.criterion(outputs, labels.cuda())
    # self.backward(loss)
    self.backbone.eval()
    outputs_eval = self.forward(images)
    print("hello")

the backward-related code has been commented out, but the outputs are still totally different.

Please try moving the import outside of the loop.
Your program does the import every time in every iteration.
And tell me the reason why you use with torch.no_grad():.

When you remove the import and the with statement, what happens?

@AlexLuya what are the modules of your backbone?

InceptionResNetV2

So you are using InceptionResNetV2, which uses an F.dropout(0.8) module by default. This module changes the behavior of your model in training vs eval mode. In training mode, the dropout suppresses some connections between the layers to overcome overfitting, but in eval mode it keeps all the connections. So the output will change.
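Roughly, the functional dropout behaves like this (a standalone sketch; check the actual model code for whether 0.8 is the drop or the keep probability there):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 6)

# training=True: elements are randomly zeroed, survivors are rescaled by 1/(1-p)
out_train = F.dropout(x, p=0.8, training=True)

# training=False: dropout does nothing, the input comes back unchanged
out_eval = F.dropout(x, p=0.8, training=False)

print(torch.equal(out_eval, x))   # True
print(out_train)                  # mostly zeros, large surviving values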


I think that if you set dropRate to 0 the output will be the same for both eval and train. (But in that case the Dropout module will be useless.)
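A quick check of that claim for a dropout layer in isolation:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.0)          # drop rate 0: nothing is zeroed, no rescaling
x = torch.randn(4, 8)

drop.train()
out_train = drop(x)
drop.eval()
out_eval = drop(x)

print(torch.equal(out_train, out_eval))   # True: identical in both modes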


Thanks, after moving
import torch
outside of the loop, these two outputs are still totally different.

As for why I use "with torch.no_grad()":
after commenting out the backward pass, the computation graph won't be deleted, and I got a CUDA out-of-memory exception, so I use
torch.no_grad():
to tell PyTorch not to hold the computation graph.
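Something like this minimal illustration of the difference:

import torch
import torch.nn as nn

model = nn.Linear(1024, 1024)
x = torch.randn(64, 1024)

out = model(x)
print(out.requires_grad)          # True: a graph is recorded and kept alive

with torch.no_grad():
    out_ng = model(x)
print(out_ng.requires_grad)       # False: no graph is built, nothing extra held in memory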

OK,

Please try dropout rate = 0.0 as suggested by @SoucheChapich-san. If the result is the same, then the reason is the dropout. If it still shows different numbers, then the train and eval in the same loop makes the difference.

Thanks, I actually knew this. The problem is that the model already gives me 99.5% training accuracy, but the evaluation accuracy remains at 50%; even when using the training data for evaluation, the accuracy is still 0.5, and both the train and eval accuracy computations use the same code.

That is strange, because if you use exactly the same data and you take care of the dropout, the results should be the same. Can you show us the entire model? Not just the backbone.