Why do all predictions of my deep-learning classification model converge to a single label?

Small_Red_69 · August 17, 2022, 2:21am

Hi,

I tried tuning my model by lowering the learning rate from 0.01 to 0.001. After running 1000 epochs, I get a nearly perfect, steadily decreasing training-loss curve (as shown below):

Moreover, the accuracy has increased from 30.9% to 41.4%. However, the predicted values still converge to label 4 for almost every sample (see the output text below):

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testing-->corr = 5.0
testing-->pred = tensor([[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[ 1.1050, -1.1411, 0.8514, 0.0696, 0.6602]])
testing-->y = tensor([0, 0, 0, 0, 4, 0, 4, 4, 3, 2])
testing-->loss = 1.6169509887695312
testing-->pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 0])
testing-->(pred.argmax(1) == y) = tensor([False, False, False, False, True, False, True, True, False, False])
testing-->corr = 3.0
testing-->pred = tensor([[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451]])
testing-->y = tensor([1, 3, 4, 4, 4, 2, 2, 4, 0])
testing-->loss = 1.4929860830307007
testing-->pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing-->(pred.argmax(1) == y) = tensor([False, False, True, True, True, False, False, True, False])
testing-->corr = 4.0
Test Error:
Accuracy: 41.4%, Avg loss: 1.523919

Done!
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
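To make the symptom concrete, the last batch in the log can be reproduced in isolation. This is only a minimal sketch: the logits and labels are copied from the log, while the loss function is an assumption (mean-reduced cross-entropy, which is not shown in this post but does reproduce the logged value). Names such as `n_unique_rows` and `label_counts` are just illustrative.

```python
import torch
import torch.nn.functional as F

# Logits copied from the last test batch in the log: nine identical rows.
row = [-0.3121, -0.1847, -0.0434, 0.0043, 0.4451]
pred = torch.tensor([row] * 9)
y = torch.tensor([1, 3, 4, 4, 4, 2, 2, 4, 0])

# Same quantities the log prints for each batch.
pred_labels = pred.argmax(1)
corr = (pred_labels == y).type(torch.float).sum().item()

# How many distinct output rows does the batch contain?
# A value of 1 means the output did not change across these inputs.
n_unique_rows = torch.unique(pred, dim=0).shape[0]

# How often each of the 5 labels occurs in this batch.
label_counts = torch.bincount(y, minlength=5)

# ASSUMPTION: the loss is mean-reduced cross-entropy on raw logits.
loss = F.cross_entropy(pred, y)

print("predicted labels:", pred_labels.tolist())
print("correct:", corr)
print("distinct output rows:", n_unique_rows)
print("label counts:", label_counts.tolist())
print("loss:", loss.item())  # about 1.493, in line with the logged 1.4929...
```

The check on `n_unique_rows` is the part that seems most telling: every sample in this batch receives exactly the same logits, so label 4 wins simply because it has the largest constant logit, and the accuracy ends up tracking how often label 4 appears in the test labels.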

Do you have any suggestions for resolving this “convergence” issue?

Thanks.