Strange classification accuracy drop

I’m training an incremental classification model (ResNet-32) on the CIFAR-100 dataset.

Using cross-entropy loss, I get a final accuracy of 43.2%. I fixed the random seed for every random module (e.g. Python's built-in random module, np.random, torch, etc.), so I get the same accuracy each time I retrain the model.
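For reference, a minimal seeding helper along these lines (the exact cuDNN flags are my assumption, not something stated in this post) might look like:

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed every RNG the training loop may touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and all CUDA generators
    # Ask cuDNN for deterministic kernels (may cost some speed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```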

However, if I put some other torch.tensor on the GPU, the accuracy drops to 42.6%, even though this tensor is never used in training.

(torch.tensor example: self.class_sizes = torch.tensor([20, 20, 500, 500, .....]).view(-1, 1).to(self._device). self.class_sizes is never used in the training process; I just put this variable there.)

Does anyone have any ideas or encountered similar situations? Thanks!

conda environment


0809 Update:
Another similar situation:
I have a Python class (BigModel below) that contains the training/test procedures.

import torch.nn as nn

class AuxModel(nn.Module):
    def __init__(self):
        super(AuxModel, self).__init__()

        self.align_FC = nn.Linear(64, 1)

        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
    def forward(self, x):
        return self.align_FC(x)

class BigModel():
    def __init__(self, args):
        self._network = ResNet32(...)

        self.SP = AuxModel()

    def train(self, ...):
        ...

    def test(self, ...):
        ...

If I train the model (ResNet32) and never use self.SP, I get a lower accuracy (74.2%).

But if I comment out self.SP, I get a higher accuracy (76.6%). This also confirms that self.SP is never used anywhere in my code: after commenting it out, I did not get any error message.

Since AuxModel initializes its modules randomly, the pseudorandom number generator is called during its construction and will thus change the random values of all following operations. The accuracy decrease seems to be a symptom of the changed RNG state, and you should see the same effect if you remove self.SP and rerun the code with different seeds.
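This is easy to verify in isolation: constructing a layer draws from the global PyTorch RNG, so every value sampled afterwards shifts. A minimal sketch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
a = torch.randn(3)  # first random draw after seeding

torch.manual_seed(0)
_ = nn.Linear(64, 1)  # weight/bias init consumes random numbers
b = torch.randn(3)    # same seed, but the RNG state has advanced

print(torch.equal(a, b))  # False: the extra module shifted all later draws
```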


Many thanks for your reply!

I will try different seeds and different self.SP architectures in some experiments, and report the results later.

One thing not mentioned above: different self.SP architectures cause different final accuracy drops! I think this observation also results from the effect you mentioned, which is mainly related to the pseudorandom number generator.
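If keeping the auxiliary module is desirable, one way to stop its initialization from perturbing later random draws is to fork the global RNG state around its construction, e.g. with torch.random.fork_rng (my suggestion, not something proposed in the thread):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
with torch.random.fork_rng():  # snapshot RNG state, restore it on exit
    aux = nn.Linear(64, 1)     # this init no longer affects draws outside

x = torch.randn(3)

torch.manual_seed(0)
y = torch.randn(3)             # same sequence as x, as if aux never existed
print(torch.equal(x, y))  # True
```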