Hi, friends,

The use case of my self-built PyTorch deep-learning model is patients’ medical-appointment booking behaviour. In detail, the model predicts how many days in advance (excluding weekends and public holidays) a patient will book a medical appointment; the range is 2 to 20 days.

Actually this is quite a simple use case. However, my model’s accuracy stays very low: even after 1,000 epochs it is only 30.9%, and the average loss is still as high as 2.058354.

In my model, each input tensor is a vector composed of the following 7 fields:

- Gender: 1 – male, 2 – female
- Age
- Area: I take the first 3 digits of a patient’s residential postal code and map them to an integer from 0 to 999.
- Medical Examination: 1 – yes, 0 – no
- Blood Test: 1 – yes, 0 – no
- Urine Test: 1 – yes, 0 – no
- Fasting: 1 – yes, 0 – no
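To make the field scheme concrete, here is a small sketch of how one input row is built (the function name and sample values are illustrative, not from my actual CSV):

```python
# Hypothetical helper showing the 7-field encoding described above.
def encode_patient(gender, age, postal_code, med_exam, blood_test, urine_test, fasting):
    """Map one booking record to the 7-element feature vector."""
    area = int(str(postal_code)[:3])  # first 3 digits of the postal code -> 0..999
    return [gender, age, area, med_exam, blood_test, urine_test, fasting]

# e.g. a 45-year-old male (gender=1) from postal code 238801,
# booked for a blood test with fasting:
print(encode_patient(1, 45, 238801, 0, 1, 0, 1))
# -> [1, 45, 238, 0, 1, 0, 1]
```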

The labels are obtained by subtracting 2 from the booking-in-advance days, giving a range of 0 to 18, i.e. 19 classes in total.
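In code, the label mapping and its inverse look like this (function names are just illustrative):

```python
# Booking-in-advance days (2..20) -> class index (0..18), and back.
def days_to_label(days):
    assert 2 <= days <= 20, "bookings are between 2 and 20 days in advance"
    return days - 2

def label_to_days(label):
    return label + 2  # invert when reading a predicted class back as days

print(days_to_label(2), days_to_label(20))
# -> 0 18
```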

In total I prepared 377 training samples and 94 test samples for the deep-learning model.

My model basically follows the Fashion-MNIST example from the official pytorch.org tutorials (see the hyperlink Optimizing Model Parameters — PyTorch Tutorials 1.12.0+cu102 documentation). I also use the same kind of hyperparameters: the hidden-layer size of the neural network is 512, the learning rate is 0.01 and the batch size is 10. (At the end of this post, I attach my full Python source code for your reference.)

Can any experts help me analyse and diagnose why my accuracy is so low? I suspect one or more of the following reasons:

- The training set (377 samples) is too small.
- 1,000 epochs of training are not enough.
- The model itself is not effective enough (should I consider other architectures, or is the loss function a good choice?).
- Other hyperparameters, e.g. batch size and learning rate, may be misconfigured.
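One more detail that may matter: the raw features are on very different scales (Age up to ~100 and Area up to 999 sit next to 0/1 flags), and I do not normalize them anywhere. A minimal sketch of per-column standardization, with made-up values rather than my real data:

```python
import torch

# Illustrative batch: [gender, age, area, exam, blood, urine, fasting]
X = torch.tensor([[1., 45., 238., 0., 1., 0., 1.],
                  [2., 62., 560., 1., 1., 1., 1.],
                  [1., 30., 101., 0., 0., 0., 0.]])

mean = X.mean(dim=0)
std = X.std(dim=0).clamp(min=1e-8)  # guard against constant columns
X_norm = (X - mean) / std           # each column now has mean ~0, std ~1

print(X_norm.mean(dim=0))  # all entries close to zero
```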

Thanks a lot for any help. If anybody needs more information to investigate the issue, please kindly let me know and I will do my best to provide it.

# My source code for the PyTorch deep-learning model:

```
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
import pandas as pd


class CustomDataset(Dataset):
    """Characterizes a dataset for PyTorch."""

    def __init__(self, csv_file):
        """
        Args:
            csv_file (string): Path to the csv file.
        """
        raw_data = pd.read_csv(csv_file)
        raw_data = torch.tensor(raw_data.to_numpy())
        x_size = raw_data.size(1)
        # All columns except the last are features; the last column is the label.
        self.data_tensor = raw_data[:, :x_size - 1].clone().type(torch.float32)
        # CrossEntropyLoss expects integer class indices, so cast labels to long.
        self.label_tensor = raw_data[:, x_size - 1:].clone().flatten().long()
        print(f"self.label_tensor.size() = {self.label_tensor.size()}")

    def __len__(self):
        """Denotes the total number of samples."""
        return len(self.data_tensor)

    def __getitem__(self, index):
        data, label = self.data_tensor[index], self.label_tensor[index]
        return data, label


# Raw strings so backslashes in the Windows paths are not read as escapes
# (e.g. "\t" in "\training_data.csv" would otherwise become a tab).
training_data_csv_file = r"D:\Tools\PyTorch\Deep-Learning-Model\input_data\training_data.csv"
training_data = CustomDataset(training_data_csv_file)

test_data_csv_file = r"D:\Tools\PyTorch\Deep-Learning-Model\input_data\testing_data.csv"
test_data = CustomDataset(test_data_csv_file)

train_dataloader = DataLoader(training_data, batch_size=10)
test_dataloader = DataLoader(test_data, batch_size=10)


class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(7, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 19),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


model = NeuralNetwork()


def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    print(f"size = {size}")
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = model(X)
        print(f"pred = {pred}")
        print(f"y = {y}")
        loss = loss_fn(pred, y)
        print(f"loss = {loss}")

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


learning_rate = 0.01
batch_size = 10


def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")


loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
```
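The listing above ends at the optimizer; the driver loop I run follows the same pattern as the tutorial. Here is a self-contained sketch of it, using synthetic stand-in data (random features and labels, not my real CSVs) and only a few epochs:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins for the real datasets: 30 random 7-feature rows, 19 classes.
X = torch.randn(30, 7)
y = torch.randint(0, 19, (30,))
loader = DataLoader(TensorDataset(X, y), batch_size=10)

model = nn.Sequential(
    nn.Linear(7, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 19),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(3):  # 1,000 epochs in my real run
    for Xb, yb in loader:
        loss = loss_fn(model(Xb), yb)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

print("final batch loss:", loss.item())
```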