Two forward passes during training yield lower accuracy

I have the code below:

            criterion = nn.CrossEntropyLoss(reduction='mean', ignore_index=-1)

            pred_weak = model(train_u_img)     # weakly augmented unlabeled batch
            pred_str  = model(train_u_strong)  # strongly augmented unlabeled batch
            pred, _ = model(train_l_img)       # labeled batch

            sup_loss = criterion(pred, train_l_label)

            loss = sup_loss


My question is: why do these 3 forward passes make the model perform worse than a single forward pass? I still need the values pred_weak and pred_str, so how can I deal with this?


You are not sharing a lot of information about how the performance is measured etc., but my guess is that you are using e.g. batchnorm layers, which update their running stats in each forward pass. If the data from the additional forward passes doesn’t fit into the data domain of your actual validation dataset, the performance might decrease.
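One common workaround for this is to freeze the batchnorm running stats during the extra forward passes. A minimal sketch, assuming a toy model with a single `BatchNorm2d` layer (the model and shapes are illustrative, not the thread's actual model):

```python
import torch
import torch.nn as nn

# Toy model just to illustrate; model[1] is the BatchNorm2d layer.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU())

def set_bn_eval(m):
    # Put only BatchNorm layers into eval mode so their
    # running_mean / running_var are not updated by the forward pass.
    if isinstance(m, nn.modules.batchnorm._BatchNorm):
        m.eval()

def set_bn_train(m):
    if isinstance(m, nn.modules.batchnorm._BatchNorm):
        m.train()

x = torch.randn(4, 3, 16, 16)

model.train()
stats_before = model[1].running_mean.clone()

model.apply(set_bn_eval)   # freeze BN stats for the unlabeled passes
_ = model(x)
assert torch.equal(model[1].running_mean, stats_before)  # stats unchanged

model.apply(set_bn_train)  # restore normal BN behavior for the labeled pass
_ = model(x)
assert not torch.equal(model[1].running_mean, stats_before)  # stats updated
```

Note that `eval()` on the BN layers also switches them to normalize with the running stats instead of the batch stats, which may or may not be what you want for the unlabeled batches.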

Actually, I’m training a semi-supervised model. train_u_img are the unlabeled images and train_u_strong are the same unlabeled images with strong augmentation. So I want to forward pass these values 2 times and the actual labeled images 1 time. Additionally, I measure the unsupervised loss as the MSE between pred_weak and pred_str like this: csst_loss = criterion_csst(softmax_pred_strong, softmax_pred_weak)
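The consistency term above could be sketched as follows. The tensor shapes, the weight `lambda_csst`, and detaching the weak prediction as the target are assumptions (detaching is common in FixMatch-style methods, but the thread doesn't state it):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy logits standing in for pred_weak and pred_str from the thread;
# batch of 4, 10 classes (assumed shapes).
pred_weak = torch.randn(4, 10)
pred_str = torch.randn(4, 10, requires_grad=True)

criterion_csst = nn.MSELoss()

# Treat the weak-view prediction as a fixed target (assumption).
softmax_pred_weak = F.softmax(pred_weak, dim=1).detach()
softmax_pred_strong = F.softmax(pred_str, dim=1)

csst_loss = criterion_csst(softmax_pred_strong, softmax_pred_weak)

lambda_csst = 1.0  # assumed consistency weight
# In the training loop this would be combined with the supervised loss:
# loss = sup_loss + lambda_csst * csst_loss
csst_loss.backward()
```

With this structure, gradients from the consistency term only flow through the strongly augmented branch.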