Model initialized from scratch is not training at all

Hi,
I am trying to train a CNN to classify dogs and cats, but the network always outputs a loss of 0.6931 and an accuracy of 50%. I have seen the same issue in this tutorial: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
I expect the validation accuracy of scratch_model to increase slowly, so why is it not increasing at all?
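
For reference, a loss stuck at 0.6931 is what cross-entropy gives for two classes when the model assigns probability 0.5 to the true class of every sample, since -ln(0.5) = ln 2. A quick check in plain Python (only a sketch, not taken from the linked script; the helper name binary_cross_entropy is made up for illustration):

```python
import math

def binary_cross_entropy(p_true_class):
    """Cross-entropy loss for one sample, given the probability assigned to its true class."""
    return -math.log(p_true_class)

# A model that outputs 0.5 for both classes, regardless of the input
chance_loss = binary_cross_entropy(0.5)
print(round(chance_loss, 4))  # 0.6931
```

So a loss of 0.6931 together with 50% accuracy suggests the network is predicting the same thing for every image.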

Could you post your training code so that we can have a look and debug it?
Sometimes a hyperparameter is off, e.g. the learning rate is too high.

Hi ptrblck
here is the code https://www.kaggle.com/bharat0/script-dogs-vs-cats/log?scriptVersionId=7399117

Thank you

Even the pretrained model is showing the same results: https://www.kaggle.com/bharat0/dog-vs-cat-pretrained-resnet?scriptVersionId=7404637
I think I made a mistake somewhere, but I am not sure where.

Thanks for the code! The second link seems to be dead.
Did the model work in any configuration before?
Your transformations seem to be a bit odd, as you specify some probabilities as 1 for random transformations, making them practically deterministic.

I would suggest scaling down some of the transformations and the model, and trying to fit a baseline model to the data first.


trans.RandomChoice([
        trans.RandomGrayscale(1),
        trans.RandomHorizontalFlip(1),
        trans.RandomVerticalFlip(1),
        trans.RandomAffine(180)
    ]),

I thought RandomChoice would choose only one of them, so I set the probabilities to 1.

I have removed these transformations and rerun the script, but with no improvement.
No, the model has never worked in any configuration, and I don’t know why.
When I run it on a single example (feeding the same X every time), the loss sometimes decreases and sometimes doesn’t.
Please check the first link, which is now https://kaggle.com/bharat0/script-dogs-vs-cats?scriptVersionId=7412317

I am running the second one without transformations. Once it is done I will post the link to it

RandomChoice will randomly select one of these transformations.
However if you specify RandomHorizontalFlip with a probability of 1, each image will be flipped.
Try to create a simpler model without all the dropout layers, and maybe with fewer layers.
It’s quite hard to debug your training if both the model and the preprocessing are complicated, since there are a lot of knobs to tune now. :wink:
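
To illustrate the RandomChoice point, here is a plain-Python simulation (a sketch only; it does not call torchvision, and random_apply and random_choice are made-up stand-ins for the real transforms). RandomAffine is left out because it takes no probability argument.

```python
import random

def random_apply(name, p, rng):
    """Mimics a Random* transform: returns its name if it fires, else None."""
    return name if rng.random() < p else None

def random_choice(transforms, rng):
    """Mimics RandomChoice: picks exactly one transform and calls it."""
    name, p = rng.choice(transforms)
    return random_apply(name, p, rng)

rng = random.Random(0)
with_p1 = [("grayscale", 1.0), ("hflip", 1.0), ("vflip", 1.0)]
with_p_half = [("grayscale", 0.5), ("hflip", 0.5), ("vflip", 0.5)]

# Count how many of 1000 simulated images come through unchanged
untouched_p1 = sum(random_choice(with_p1, rng) is None for _ in range(1000))
untouched_p_half = sum(random_choice(with_p_half, rng) is None for _ in range(1000))
print(untouched_p1, untouched_p_half)
```

With p=1 no image ever passes through untouched, so the model never sees an original image. With a lower p, a good share of them are left as they are.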

Sure, I will shrink the model and try again.
Preprocessing mainly consists of:

img = Image.open(path)
# label: 1 for dog, 0 for cat
y = 1 if 'dog' in path else 0
if self.transform:
    img = self.transform(img)
return img, y

path is like …/input/train/dog.123.jpg or …/input/train/cat.456.jpg
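
One thing worth ruling out with this kind of labeling: `'dog' in path` looks at the whole path, so a directory name containing "dog" would label every image as 1. I don't know whether that applies to the paths in the linked script, so take this only as a defensive sketch. label_from_path is a hypothetical helper and the example paths are made up.

```python
import os

def label_from_path(path):
    """Hypothetical helper: 1 for dog, 0 for cat, based on the file name only."""
    filename = os.path.basename(path)
    return 1 if filename.startswith("dog") else 0

print(label_from_path("input/train/dog.123.jpg"))         # 1
print(label_from_path("input/train/cat.456.jpg"))         # 0
# A naive "'dog' in path" check would return 1 for this cat image
print(label_from_path("dogs-vs-cats/train/cat.456.jpg"))  # 0
```

Counting the labels over the whole dataset is a cheap way to confirm that both classes actually show up.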

@ptrblck
I have minimized the model, and I think that:

  1. I used too many dropout layers (one after every layer).
  2. Applying the transformations from the very beginning is a bad idea.
    We should apply transformations after the net has learned something.
  3. The learning rate is the most important factor here.
    I used the RMSprop optimizer, which has a default lr of 0.01, and with it the model does not learn.
    The model learns with the Adam optimizer (default lr=0.001) and with RMSprop at lr=0.001.
    Then I increased the model size, and lr=0.001 no longer works for either optimizer, whereas lr=0.0001 works with both.
    I would say that adjusting the learning rate made the difference here (more than transforms or dropout).
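
The kind of learning rate comparison described in point 3 can be scripted as a small sweep. The sketch below uses a toy quadratic loss as a stand-in for the real training loop, so the numbers say nothing about the actual model. final_loss and the list of learning rates are assumptions made for illustration.

```python
def final_loss(lr, steps=50, w0=1.0):
    """Run plain gradient descent on the toy loss w**2 and return the last loss value."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        w = w - lr * grad
    return w ** 2

initial_loss = 1.0  # loss at the starting point w0 = 1.0
for lr in (1.5, 0.1, 0.01, 0.001):
    loss = final_loss(lr)
    status = "improves" if loss < initial_loss else "does not improve"
    print(f"lr={lr}: final loss {loss:.3g} ({status})")
```

On this toy problem the largest rate overshoots and the loss blows up, while the smaller ones make progress at different speeds. In a real run, final_loss would be replaced by a few hundred training steps on a small subset of the data.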

Am I right or am I missing anything?
The code is here: https://www.kaggle.com/bharat0/dog-vs-cat?scriptVersionId=7421775
Thank you very much for the support.

Your steps seem logical, and apparently the debugging worked.
Congrats! Now if your model is learning, you could try to scale it up a bit, e.g. adding data augmentation, and see if you can get a better accuracy.

Yes, I will train on the full data. The code uses only a sample.

Hi, I am currently using pretrained models like FasterRCNN, MaskRCNN, etc., but I am still getting false positives after fine-tuning. Now I am planning to train a model from scratch (without using any pretrained model), but I am not sure whether it will work. Any suggestions?

Also, for object detection, could you give me some information about when to use a pretrained model with fine-tuning and when to train a model from scratch?

Thanks in advance :slight_smile: