Sudden deterioration of model

I’m new to ML, and to PyTorch in particular, so I might have missed something, but please help me understand what’s happening.

I have a simple model. It learns pretty well and quickly reaches 100% accuracy, but over the span of just two epochs it completely deteriorates: the loss shoots way up and accuracy drops to 0%. I’ve seen spikiness in my previous attempts and elsewhere on the internet. I’ve also seen a few questions about this on StackOverflow and here on this forum, but I haven’t found an explanation of the phenomenon, at least not one I can understand yet.

I have a Sequential model. I use SoftMarginLoss (I also tried MSELoss with about the same result) as the loss function and Adam as the optimizer. I have about 65k data points in total. I randomly select a quarter of the dataset for training in each epoch and 1000 points for testing.
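Roughly, the setup looks like this (the dimensions, hidden size, and learning rate below are placeholders, not my actual values):

```python
import torch
import torch.nn as nn

# Placeholder sizes -- the real input/output dimensions aren't shown here.
n_features, n_outputs = 32, 8

model = nn.Sequential(
    nn.Linear(n_features, 64),
    nn.ReLU(),
    nn.Linear(64, n_outputs),  # raw outputs; SoftMarginLoss expects targets in {-1, 1}
)

criterion = nn.SoftMarginLoss()  # MSELoss gave about the same result
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Each epoch: a random quarter of the ~65k points for training, 1000 for testing.
dataset_size = 65_000
train_idx = torch.randperm(dataset_size)[: dataset_size // 4]
test_idx = torch.randperm(dataset_size)[:1000]
```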

This is a plot of loss and accuracy. As you can see, at epoch 41 the accuracy tanks to 0 and the loss shoots up to 0.8, higher even than in the first epoch. The loss recovers to about 0.69 but stays there for a long time. Can someone please explain what’s going on here?

An accuracy of 0 is not expected even if the model diverges, as it would be worse than a random predictor. I don’t know what kind of use case you are working on, but e.g. in a binary classification use case you could just invert the predictions to achieve an accuracy of 100%, so you should check what the random accuracy would be and why your model gets worse than that.
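As a toy illustration (with made-up tensors), a binary classifier that is wrong on every sample has 0% accuracy, but simply inverting its predictions gives 100%, which is why "worse than random" usually points to a bookkeeping or labelling issue rather than to the model itself:

```python
import torch

targets = torch.tensor([1, 0, 1, 1, 0])
preds = 1 - targets                                      # wrong on every sample
acc = (preds == targets).float().mean()                  # 0.0
inverted_acc = ((1 - preds) == targets).float().mean()   # 1.0
print(acc.item(), inverted_acc.item())
```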

The model is supposed to predict a series of binary outputs. An accuracy of 0% means that, for every sample, at least one of the outputs is not close enough to the expected value. It doesn’t mean that all outputs are the opposite, but it’s an intriguing idea and I will check whether that’s actually the case. I’m probably not using “accuracy” in the industry-standard way, so that might be a source of confusion.
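For reference, this is roughly how I compute accuracy (the shapes and the zero threshold below are illustrative, not my exact code): a sample only counts as correct if every one of its outputs matches the target.

```python
import torch

outputs = torch.randn(1000, 8)                    # model outputs, one row per sample
targets = torch.randint(0, 2, (1000, 8)) * 2 - 1  # targets in {-1, 1} as SoftMarginLoss expects

preds = (outputs > 0).long() * 2 - 1              # positive output -> +1, otherwise -1
all_correct = (preds == targets).all(dim=1)       # exact match across every output
accuracy = all_correct.float().mean()
print(f"exact-match accuracy: {accuracy.item():.3f}")
```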

I wonder, though, about the loss. Isn’t the optimizer supposed to push it towards 0? Why does the loss jump so high and fail to come back down?