Why does my deep-learning model (for classification problem) get all predicted values converge to a single label?

Small_Red_69 · August 10, 2022, 2:43am

Hi, friends,

Now I am exploring another PyTorch Deep-Learning model upon patients’ medical-appoint booking behaviours. This time it is to predict which weekday (from Monday to Friday) a patient will book a medical appointment. The range is 0 (for Monday) to 4 (for Friday).
This is not a so complex user case. However, the accuracy rate of my model is still quite low: even after 1,000 epochs, the accuracy rate is only 25.3%, and the average loss is 1.608337.
However, low accuracy rate is not the whole problem. To be worse, after debugging the deep-learning model, I found that somehow the output values of the model actually converges to 4 (Friday). It is understandable that the label 4 gets the highest percentage (is this due to a “relexable Friday” many people appreciate?) among the training data (26.2%) and the testing data (25.3%, that’s why the accuracy rate is 25.3%!), but all predicted results converging to a single label is still quite abnormal.
The scenario of all-predicted-result-convert-to-a-single-label is shown in the output text below:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testing–>pred = tensor([[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038]])
testing–>y = tensor([0, 2, 0, 2, 2, 2, 4, 1, 4, 0])
testing–>loss = 1.7076622247695923
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>(pred.argmax(1) == y) = tensor([False, False, False, False, False, False, True, False, True, False])
testing–>corr = 2.0
testing–>pred = tensor([[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038]])
testing–>y = tensor([0, 4, 4, 4, 1, 1, 4, 2, 1, 4])
testing–>loss = 1.4987714290618896
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>(pred.argmax(1) == y) = tensor([False, True, True, True, False, False, True, False, False, True])
testing–>corr = 5.0
testing–>pred = tensor([[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038],
[-0.4787, 0.1151, 0.0034, 0.0423, 0.3038]])
testing–>y = tensor([2, 1, 2, 4, 3, 4, 0, 4, 4])
testing–>loss = 1.5375866889953613
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>(pred.argmax(1) == y) = tensor([False, False, False, True, False, True, False, True, True])
testing–>corr = 4.0
Test Error:
Accuracy: 25.3%, Avg loss: 1.608337
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In my model, the data tensor is a vector composed of the following 7 fields:

Gender: 1 – male, 2 – female
Age
Area: I take the first 3 digits of a patient’s residential postal codes and map to an integer from 0 to 999.
Medical Examination: 1 – yes, 0 – no
Blood Test: 1 – yes, 0 – no
Urine Test: 1 – yes, 0 – no
Fasting: 1 – yes, 0 – no

The labels of my model are from 0 (for Monday) to 4 (for Friday), totally 5 categories.
Totally I prepare 397 training data and 99 testing data for the deep-learning model.

My deep-learning model has followed the Fashion-MNIST learning model from the tutorials in pytorch.org official website (see the hyperlink Optimizing Model Parameters — PyTorch Tutorials 1.12.0+cu102 documentation). In the model, I also take the following parameters: the size of middle layer in the Neural Network is 512, the learning rate is 0.01 and the batch size is 10. (At the end of this post, I will attach the full set of my python source codes for your reference).

Can anybody help me to analyse and diagnose why all predicted values will converge to a single value? Is this due to the “overfitting” issue or the learning rate is still too high?

Thanks for the help. If you need more info to investigate the issue, please just let me know.

My source codes of the Pytorch deep-learning model:

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda
import numpy as np
import pandas as pd

class CustomDataset(Dataset):
‘Characterizes a dataset for PyTorch’
def init(self, csv_file):
“”"
Args:
csv_file (string): Path to the csv file.
“”"
raw_data = pd.read_csv(csv_file)
raw_data = torch.tensor(raw_data.to_numpy())
x_size = list(raw_data.size())[1]
self.data_tensor = raw_data[:,:x_size-1].clone()
self.data_tensor = self.data_tensor.type(torch.float32)
self.label_tensor = raw_data[:,x_size-1:].clone()
self.label_tensor = self.label_tensor.flatten()
print(f"self.label_tensor.size() = {self.label_tensor.size()}")
def len(self):
‘Denotes the total number of samples’
return len(self.data_tensor)
def getitem(self, index):
data, label = self.data_tensor[index], self.label_tensor[index]
return data, label

training_data_csv_file = “D:\Tools\PyTorch\Medex-Deep-Learning-Model\input_data\training_data.csv”
training_data = CustomDataset(training_data_csv_file)
test_data_csv_file = “D:\Tools\PyTorch\Medex-Deep-Learning-Model\input_data\testing_data.csv”
test_data = CustomDataset(test_data_csv_file)

train_dataloader = DataLoader(training_data, batch_size=10)
test_dataloader = DataLoader(test_data, batch_size=10)

class NeuralNetwork(nn.Module):
def init(self):
super(NeuralNetwork, self).init()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(7, 512),
nn.ReLU(),
nn.Linear(512, 512),
nn.ReLU(),
nn.Linear(512, 5),
)

def forward(self, x):
    x = self.flatten(x)
    logits = self.linear_relu_stack(x)
    return logits

model = NeuralNetwork()

def train_loop(dataloader, model, loss_fn, optimizer):
size = len(dataloader.dataset)
print(f’size = {size}‘);
for batch, (X, y) in enumerate(dataloader):
# Compute prediction and loss
pred = model(X)
print(f’pred = {pred}’);
print(f’y = {y}‘);
loss = loss_fn(pred, y)
print(f’loss = {loss}’);
# Backpropagation
optimizer.zero_grad()
loss.backward()
optimizer.step()

    if batch % 100 == 0:
        loss, current = loss.item(), batch * len(X)
        print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

learning_rate = 0.01
batch_size = 10

def test_loop(dataloader, model, loss_fn):
size = len(dataloader.dataset)
num_batches = len(dataloader)
test_loss, correct = 0, 0

with torch.no_grad():
    for X, y in dataloader:
        pred = model(X)
        print(f'testing-->pred = {pred}');
        print(f'testing-->y = {y}');
        loss = loss_fn(pred, y).item()
        print(f'testing-->loss = {loss}');
        test_loss += loss
        print(f'testing-->pred.argmax(1) = {pred.argmax(1)}');
        print(f'testing-->(pred.argmax(1) == y) = {(pred.argmax(1) == y)}');
        corr = (pred.argmax(1) == y).type(torch.float).sum().item()
        print(f'testing-->corr = {corr}');
        correct += corr
test_loss /= num_batches
correct /= size
print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

epochs = 1000
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train_loop(train_dataloader, model, loss_fn, optimizer)
test_loop(test_dataloader, model, loss_fn)
print(“Done!”)

Small_Red_69 · August 10, 2022, 2:57am

Suddenly I come to understand what the issue could be: this may caused by the argmax(1) function. This is because after 1000 epochs, each element of the predicted output tensor has converged to a constant. At this point, when I use the argmax(1) function, then the index of the largest value, which is 4, will always be selected at the moment, that’s why all the predicted values will be converged to the label 4.

However, instead of argmax(4), which function(s) can I use to cater to the labels from the testing data more accurately?

Thanks.

ptrblck · August 10, 2022, 3:47am

torch.argmax will just return the index of the largest logit in your model’s output, i.e. the predicted class label in your case, and is not at fault.
Your model seems to overfit to class4 in this run so you would need to check why that’s the case. E.g. maybe you are dealing with an imbalanced dataset where class4 is the majority class.

Small_Red_69 · August 11, 2022, 1:37am

Hi, ptrblck,

you mentioned “maybe you are dealing with an imbalanced dataset where class4 is the majority class”.

But sorry, no, this is not the scenario. Actually my learning case is predict which weekday (from Monday to Friday) a patient is going to book an appointment, based on the gender, the age, the residential area, and the checking items (either a common medical checkup or a blood test or a urine test or an ECG test or multiple of them) of the patient.

Actually class4 is cast to Friday, from my training data, there are 126 out of 488 cases for a Friday appointment, occupying 25.8%. It is somewhat understandable that Friday cases are significantly more than other weekdays (maybe because many people may feel that Friday is quite relaxed?), but they don’t take the majority.

I can show you the distribution of all weekdays in the training data:
Monday: 11.9%
Tuesday: 21.7%
Wednesday: 21.3%
Thursday: 19.3%
Friday: 25.8%

So you can see that the training dataset is not imbalanced.

Thanks.

ptrblck · August 11, 2022, 4:35am

You are right that the class distribution is not heavily imbalanced, but still Friday is the majority class (i.e. class with the largest number of samples) and your model seems to overfit to it.

IneedMrmeeseeks · August 11, 2022, 5:08am

Can you show us the training loss graph for each epoch?
And I’d like to suggest that you need to do nomalization when you make CustomDataset class.

Small_Red_69 · August 16, 2022, 8:57am

Hi, Ineed,

Just now I got the training loss graph and want to show here:

Please kindly review it.

Also can you please show me the normalization details when making CustomDataset class?

Thank you.

Small_Red_69 · August 16, 2022, 9:08am

Hi, Ineed,

Right now I double-checked my training data, and found that the labels are already normalized to be within the range 0 to 4 (please see the blue rectangle in the screenshot below).

Can you please highlight what else I can do the normalize the CustomDataset class?

For your reference, I would like to show the source codes of my CustomDataset class below.

======================================================================
class CustomDataset(Dataset):
‘Characterizes a dataset for PyTorch’
def init(self, csv_file):
“”"
Args:
csv_file (string): Path to the csv file.
“”"
raw_data = pd.read_csv(csv_file)
raw_data = torch.tensor(raw_data.to_numpy())
x_size = list(raw_data.size())[1]
self.data_tensor = raw_data[:,:x_size-1].clone()
self.data_tensor = self.data_tensor.type(torch.float32)
self.label_tensor = raw_data[:,x_size-1:].clone()
self.label_tensor = self.label_tensor.flatten()
print(f"self.label_tensor.size() = {self.label_tensor.size()}")
def len(self):
‘Denotes the total number of samples’
return len(self.data_tensor)
def getitem(self, index):
data, label = self.data_tensor[index], self.label_tensor[index]
return data, label

Thanks.

Small_Red_69 · August 17, 2022, 2:21am

Hi, Ineed,

Right now I tried to customize my learning model by decreasing the learning rate from 0.01 to 0.001. After running 1000 epochs, I can get a nearly-perfect training-loss decrement curve (as shown below):

Moreover, the cater rate has also increased from 30.9% to 41.4%. However, the predicted tensor values are still somehow converged to label 4 (see the output text below):

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testing–>corr = 5.0
testing–>pred = tensor([[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[ 1.1050, -1.1411, 0.8514, 0.0696, 0.6602]])
testing–>y = tensor([0, 0, 0, 0, 4, 0, 4, 4, 3, 2])
testing–>loss = 1.6169509887695312
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 0])
testing–>(pred.argmax(1) == y) = tensor([False, False, False, False, True, False, True, True, False, False])
testing–>corr = 3.0
testing–>pred = tensor([[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451],
[-0.3121, -0.1847, -0.0434, 0.0043, 0.4451]])
testing–>y = tensor([1, 3, 4, 4, 4, 2, 2, 4, 0])
testing–>loss = 1.4929860830307007
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>(pred.argmax(1) == y) = tensor([False, False, True, True, True, False, False, True, False])
testing–>corr = 4.0
Test Error:
Accuracy: 41.4%, Avg loss: 1.523919

Done!
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Do you have any good approaches to resolve the “convergency” issue?

Thanks.

Small_Red_69 · August 17, 2022, 3:01am

Hi, Ineed & ptrblck,

Just now I reduced the # of neurons in the middle-layer from 512 to 10 (see my NN class source codes below):

====================================================================
class NeuralNetwork(nn.Module):
def init(self):
super(NeuralNetwork, self).init()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(7, 10),
nn.ReLU(),
nn.Linear(10, 10),
nn.ReLU(),
nn.Linear(10, 5),
)

def forward(self, x):
    x = self.flatten(x)
    logits = self.linear_relu_stack(x)
    return logits

====================================================================

I re-ran 1000 epochs, then got a super-perfect training-loss curve (see the diagram below)! Quite amazingly!

However, just the “convergency to label 4” issue is still pending. Please kindly help to give me some ideas on this issue.

Thanks!

IneedMrmeeseeks · August 17, 2022, 4:29am

Hi, I had national holiday yesterday.

You can customize your Dataset class. In getitem function, you can add some normalization lines.

Or you nomalize your dataframe first and then load the dataframe with this class. (you don’t need to normalize dependent variable.)
It depends on your taste.
Calculate mean and variance or Use scikit.preprocessing.

Also I recommend to use nn.softmax at the end of layers for interpretability.

Oops! I forgot to mention that you also need to preprocess the categorical variables.(binary 0, 1 will be fine. Gender should be changed to be 0, 1.)(Look up one-hot encoding or embedding)
Normalization is for the continuous variables.

IneedMrmeeseeks · August 17, 2022, 4:31am

And I want to notify you that reducing learning rates is irrelevant with your issue.
It may seems perfect loss curve, but It actually shows same loss value.

Small_Red_69 · August 17, 2022, 8:41am

Hi, Ineed,

Thanks a lot for your feedback.
May I keep on clarifying with you something?

“Also I recommend to use nn.softmax at the end of layers for interpretability.”

Do you mean that in the output layer, I will apply the nn.softmax function to normalize the predicted tensors? An example of a predicted tensor is shown below:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testing–>pred = tensor([[-0.7003, -0.8617, -0.5095, -0.5607, -0.0856],
[-1.8335, -1.4326, -1.3105, -1.2570, -0.6620],
[-1.0715, -1.0413, -0.7767, -0.7803, -0.2806],
[-0.5288, -0.8138, -0.3624, -0.4996, 0.0346],
[-0.9010, -0.9953, -0.6293, -0.7216, -0.1595],
[-1.1588, -1.1293, -0.8057, -0.8878, -0.2828],
[-0.3873, -0.7340, -0.2692, -0.4018, 0.0979],
[-0.6543, -0.8487, -0.4722, -0.5421, -0.0564],
[-0.8280, -0.9215, -0.6081, -0.6287, -0.1617]])
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

so do you mean that I need to perform the nn.Softmax(dim=1) operation upon the above-mentioned output tensors to normalize the final output tensors?

“I forgot to mention that you also need to preprocess the categorical variables.(binary 0, 1 will be fine. Gender should be changed to be 0, 1.)”

Yes, I will normalize the gender values as 0 and 1. However, I also define the other categorical variables (Age and Area) as discrete variables:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Age: from 18 to 63, totally 46 possible values
Area: from 000 to 999, totally 1000 possible values
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

So please show me whether I also need to normalize the Age and Area variables to be within the range [0, 1], for example, regarding to an age values, I can cater 18 to value 0, cater 19 to value 0+1/45=0.022, so on and so forth, finally cater 63 to value 1. For Area values, I will cater 000 to 0.0, 001 to 0.001, so on and so forth, then cater 999 to 0.999.

Can I do things by this way?

Thanks.

IneedMrmeeseeks · August 17, 2022, 9:54am

Actually nn.CrossEntropyLoss does the logsoftmax for you. Here is the link
So, you don’t need to add a softmax layer. sry for the confusion.

Age, Area should be considered as continuous variables. That’s why I mentioned about normalization.
Your last paragraph explanation is so called “Min-Max normalization”.
I talked about more like standardization which use mean and variance(mean=0 and std=1).
Both ways would be fine.

edit) Sry again. Area is the post code!! Ok, then you may want to use embedding with fixed length.

Small_Red_69 · August 18, 2022, 2:06pm

Hi, Ineed,

Sorry what do you mean " you may want to use embedding with fixed length"? What is “embedding”?

Thanks.

Small_Red_69 · August 18, 2022, 2:36pm

Hi, Ineed,

I have almost tried all your proposed methods to address the “convergency to same label value” issue.

“So, you don’t need to add a softmax layer”

I already removed the software function from the NN structure. Please see the source codes of my NN class below:

=====================================================================
class NeuralNetwork(nn.Module):
def init(self):
super(NeuralNetwork, self).init()
self.flatten = nn.Flatten()
self.linear_relu_stack = nn.Sequential(
nn.Linear(7, 10),
nn.ReLU(),
nn.Linear(10, 10),
nn.ReLU(),
nn.Linear(10, 5),
)

def forward(self, x):
    x = self.flatten(x)
    logits = self.linear_relu_stack(x)
    return logits

=====================================================================

“Min-Max normalization”
I have implemented the “Min-Max normalization” upon the Gender, Age and Area fields, making all of their values within the range of [0, 1]: In the Gender field, I subtract all values by 1, making a male patient to get the value 1, and a female patient to get the value 0; In the Age field, I divided all values by 100, making the age values between 0.18 and 0.63; In the Area field, since there are 6-digit postal codes, at first I take the first 3 digits, to make the range between 0 and 999, and then divided them by 1000, making the area values between 0 and 1.
Please refer to the screenshot below to see my normalization result of the input data.

Already_Performed_Min_Max_Normalization_Upon_Gender_Age_Area_fields1366×768 227 KB

Thereafter I run 1000 epochs again. However, to be frustrated, all the predicted values are still converged to label 4. (see the output text below)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
testing–>pred.argmax(1) = tensor([4, 4, 4, 4, 4, 4, 4, 4, 4])
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Do you have any more approaches to resolve the “convergency” issue?

Thanks.

Small_Red_69 · August 19, 2022, 12:36am

Hi, Ineed,

Actually I want to share with you the full set of my training data & testing data, but don’t know how to upload csv files in this forum.

So far the size of the training data is only 118, and the testing data is only 29. Can the pending “convergency” issue be caused by these too small size training & testing dataset?

Please kindly help me to diagnose the issue.

Later on I will try to find a way to share my training/testing dataset with you.

IneedMrmeeseeks · August 19, 2022, 2:57am

Hi, there.
I have my own work to do. It’d be better if you look up the way to solve your problem.
I want you to know the way to solve, not just the answer.

Here is the article which is very similar with your problem. Take a look and I hope this article helps you.

Best regard.

Small_Red_69 · August 20, 2022, 3:01am

Hi, Ineed,

Just now I uploaded my DataSet csv files into Microsoft One-Drive for your reference.

Training Date

training_data.xlsx - Microsoft Excel Online (live.com)

Testing Data

testing_data.xlsx - Microsoft Excel Online (live.com)

It is ok that you are concentrated on your work at 1st. If only u get time, then you can help to view my case. No hurry.

Thank you for the help and sorry for any inconvenience caused.

Why does my deep-learning model (for classification problem) get all predicted values converge to a single label?

My source codes of the Pytorch deep-learning model:

epochs = 1000 for t in range(epochs): print(f"Epoch {t+1}\n-------------------------------") train_loop(train_dataloader, model, loss_fn, optimizer) test_loop(test_dataloader, model, loss_fn) print(“Done!”)

epochs = 1000
for t in range(epochs):
print(f"Epoch {t+1}\n-------------------------------")
train_loop(train_dataloader, model, loss_fn, optimizer)
test_loop(test_dataloader, model, loss_fn)
print(“Done!”)