Unable to reduce the average classification loss on test data of a spectrogram classifier

Hello, I am working on a school project: a spectrogram classifier (images of bird song) built with an autoencoder.

I made a first 'basic' version that worked but was only 63% accurate. After reworking it, I'm now approaching 80%, which I'm happy with. The only remaining problem is that my average classification loss on the test data increases with each epoch instead of decreasing.
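From what I understand, cross-entropy penalizes confident mistakes heavily, so the test loss can climb even while accuracy stays flat; here is a standalone toy example of that effect (not from my project, just to illustrate):

import torch
import torch.nn.functional as F

# Two batches with identical accuracy (1 of 2 correct) but very different losses.
labels = torch.tensor([0, 0])
logits_soft = torch.tensor([[1.0, 0.0], [0.0, 1.0]])  # mildly confident, one wrong
logits_hard = torch.tensor([[8.0, 0.0], [0.0, 8.0]])  # very confident, one wrong

print(F.cross_entropy(logits_soft, labels))  # ~0.81
print(F.cross_entropy(logits_hard, labels))  # ~4.00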

I've already tried adding dropout and changing its rate, reworking my data by augmenting it and adding noise, and I've also added weight_decay and a learning-rate scheduler. The hyperparameters seem reasonable for my case. No matter what I try, the result stays the same: the loss keeps increasing.
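To give an idea, the noise augmentation I mention was roughly of this form (a minimal sketch with illustrative values, not my exact pipeline; AddGaussianNoise is just a made-up helper name):

import torch

class AddGaussianNoise:
    # Adds Gaussian noise to a spectrogram tensor; mean/std are illustrative.
    def __init__(self, mean=0.0, std=0.05):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        return tensor + torch.randn_like(tensor) * self.std + self.mean

# Applied after transforms.ToTensor() in the training transform.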
Could someone please help me?

Here is my code with the latest results, using a dropout of 0.6 (I don't have the results for the 0.5 version, which performs best, but they are very close):


import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

# Load the data

train_spectrogram_directory = r'C:\Users\bnuof\AC_data\MINOR\3_control\TRAIN_4control'
test_spectrogram_directory = r'C:\Users\bnuof\AC_data\MINOR\Donnees_jamais_vu_non_augmentee\TEST_NS_10'
print("Data loaded")

# Move to the GPU if it is available

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# Define the transformations for the data

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),  # Convert to grayscale
    transforms.Resize((256, 256)),  # Resize to 256x256
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

# Load the training data

train_dataset = datasets.ImageFolder(root=train_spectrogram_directory, transform=transform)
trainloader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=2)
print("Training data loaded")

# Count the number of examples per class in the training data

class_counts_train = {}
for _, labels in trainloader:
    for label in labels:
        label = label.item()
        if label in class_counts_train:
            class_counts_train[label] += 1
        else:
            class_counts_train[label] = 1

print("Class distribution in the training data:")
print(class_counts_train)

# Load the test data

test_dataset = datasets.ImageFolder(root=test_spectrogram_directory, transform=transform)
testloader = DataLoader(test_dataset, batch_size=32, shuffle=False, num_workers=2)
print("Test data loaded")

# Count the number of examples per class in the test data

class_counts_test = {}
for _, labels in testloader:
    for label in labels:
        label = label.item()
        if label in class_counts_test:
            class_counts_test[label] += 1
        else:
            class_counts_test[label] = 1

print("Class distribution in the test data:")
print(class_counts_test)

# Check the label format in the training data

for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    print(f"Training sample labels {i}: {labels}")
    break

# Check the label format in the test data

for i, data in enumerate(testloader, 0):
    inputs, labels = data
    print(f"Test sample labels {i}: {labels}")
    break

# Define an autoencoder with a classifier

class AutoencoderClassifier(nn.Module):
    def __init__(self):
        super(AutoencoderClassifier, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=2, stride=2),
            nn.Sigmoid()
        )
        self.classifier = nn.Sequential(
            nn.Linear(128 * 32 * 32, 256),
            nn.ReLU(),
            nn.Dropout(0.6),
            nn.Linear(256, 128),  # Extra linear layer with dropout
            nn.ReLU(),
            nn.Dropout(0.6),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.6),
            nn.Linear(64, 2)  # 2 classes
        )

    def forward(self, x):
        encoded = self.encoder(x)
        encoded_flattened = encoded.view(encoded.size(0), -1)  # Flatten the encoding
        output = self.classifier(encoded_flattened)
        reconstructed = self.decoder(encoded)
        return output, reconstructed

autoencoder_classifier = AutoencoderClassifier().to(device)
print("Autoencoder with classifier defined")
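# Sanity check (optional, illustrative): a 1x1x256x256 input should encode to
# 1x128x32x32, matching the 128 * 32 * 32 classifier input, and decode back
# to 1x1x256x256.
with torch.no_grad():
    dummy = torch.randn(1, 1, 256, 256, device=device)
    out, rec = autoencoder_classifier(dummy)
    print(out.shape, rec.shape)  # torch.Size([1, 2]) torch.Size([1, 1, 256, 256])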

# Define the optimizer and the loss functions

criterion_reconstruction = nn.MSELoss()
criterion_classification = nn.CrossEntropyLoss()
optimizer = optim.Adam(autoencoder_classifier.parameters(), lr=0.001, weight_decay=1e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)  # Learning-rate decay
print("Optimizer and loss functions defined")

# Histories for the training curves

train_reconstruction_losses = []
train_classification_losses = []
test_reconstruction_losses = []
test_classification_losses = []
test_accuracies = []

# Train the autoencoder with classifier

num_epochs = 30
for epoch in range(num_epochs):
    autoencoder_classifier.train()
    running_loss_reconstruction = 0.0
    running_loss_classification = 0.0

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()
        classifications, reconstructions = autoencoder_classifier(inputs)

        # Reconstruction loss
        loss_reconstruction = criterion_reconstruction(reconstructions, inputs)

        # Classification loss
        loss_classification = criterion_classification(classifications, labels)

        # The total loss is the sum of the two
        loss = loss_reconstruction + loss_classification

        loss.backward()
        optimizer.step()

        running_loss_reconstruction += loss_reconstruction.item()
        running_loss_classification += loss_classification.item()

    train_reconstruction_losses.append(running_loss_reconstruction / len(trainloader))
    train_classification_losses.append(running_loss_classification / len(trainloader))

    scheduler.step()  # Apply the learning-rate decay once per epoch

    print(f"Epoch {epoch + 1}, Reconstruction Loss: {train_reconstruction_losses[-1]}, "
          f"Classification Loss: {train_classification_losses[-1]}")

    # Evaluate the autoencoder with classifier on the test data
    autoencoder_classifier.eval()
    total_loss_reconstruction = 0.0
    total_loss_classification = 0.0
    correct = 0
    total = 0

    with torch.no_grad():
        for i, data in enumerate(testloader):
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)

            classifications, reconstructions = autoencoder_classifier(inputs)

            # Reconstruction loss
            loss_reconstruction = criterion_reconstruction(reconstructions, inputs)

            # Classification loss
            loss_classification = criterion_classification(classifications, labels)

            total_loss_reconstruction += loss_reconstruction.item()
            total_loss_classification += loss_classification.item()

            _, predicted = torch.max(classifications, 1)

            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    test_reconstruction_losses.append(total_loss_reconstruction / len(testloader))
    test_classification_losses.append(total_loss_classification / len(testloader))
    test_accuracy = 100 * correct / total
    test_accuracies.append(test_accuracy)

    print(f"Test Reconstruction Loss: {test_reconstruction_losses[-1]}, "
          f"Test Classification Loss: {test_classification_losses[-1]}, "
          f"Test Accuracy: {test_accuracies[-1]}%")

# Average reconstruction loss on the test data (averaged over all epochs)

average_loss_reconstruction_test = sum(test_reconstruction_losses) / len(test_reconstruction_losses)

# Average classification loss on the test data (averaged over all epochs)

average_loss_classification_test = sum(test_classification_losses) / len(test_classification_losses)

# Display the results

print(f"Average loss (reconstruction) on the test data: {average_loss_reconstruction_test:.4f}")
print(f"Average loss (classification) on the test data: {average_loss_classification_test:.4f}")
print(f"Accuracy on the test data: {test_accuracies[-1]:.2f}%")

print("Training finished")

# Plotting

plt.figure(figsize=(12, 5))
plt.subplot(1, 3, 1)
plt.plot(train_reconstruction_losses, label='Training')
plt.plot(test_reconstruction_losses, label='Testing')
plt.title('Reconstruction Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1, 3, 2)
plt.plot(train_classification_losses, label='Training')
plt.plot(test_classification_losses, label='Testing')
plt.title('Classification Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1, 3, 3)
plt.plot(test_accuracies, label='Accuracy')
plt.title('Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy (%)')
plt.legend()

plt.tight_layout()
plt.show()


Results:

Class distribution in the training data:
{0: 38486, 1: 35490}
Test data loaded
Class distribution in the test data:
{0: 612, 1: 570}
Training sample labels 0: tensor([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 1, 0, 1, 1, 1, 1])
Test sample labels 0: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0])
Autoencoder with classifier defined
Optimizer and loss functions defined
Epoch 1, Reconstruction Loss: 0.6335868493856855, Classification Loss: 0.6942900905972121
Test Reconstruction Loss: 0.8886207245491646, Test Classification Loss: 0.6928663157127999, Test Accuracy: 51.776649746192895%
Epoch 2, Reconstruction Loss: 0.6114840523749074, Classification Loss: 0.6926463826150218
Test Reconstruction Loss: 0.888644730722582, Test Classification Loss: 0.6928839731860805, Test Accuracy: 51.776649746192895%
Epoch 3, Reconstruction Loss: 0.6115466533132078, Classification Loss: 0.5503022932128458
Test Reconstruction Loss: 0.8887606907535244, Test Classification Loss: 0.7935993063792184, Test Accuracy: 72.165820642978%
Epoch 4, Reconstruction Loss: 0.6115272873206947, Classification Loss: 0.18059742770272155
Test Reconstruction Loss: 0.888761043548584, Test Classification Loss: 0.7707675378266219, Test Accuracy: 78.7648054145516%
Epoch 5, Reconstruction Loss: 0.6115277723302891, Classification Loss: 0.08859755201891603
Test Reconstruction Loss: 0.8887654059642071, Test Classification Loss: 1.495634292044702, Test Accuracy: 75.38071065989848%
Epoch 6, Reconstruction Loss: 0.6115313859650008, Classification Loss: 0.06489307850087807
Test Reconstruction Loss: 0.8887686826087333, Test Classification Loss: 1.2488370343690385, Test Accuracy: 76.90355329949239%
Epoch 7, Reconstruction Loss: 0.6115346698921857, Classification Loss: 0.05467323608003281
Test Reconstruction Loss: 0.8887685827306799, Test Classification Loss: 1.4190230249251063, Test Accuracy: 76.14213197969544%
Epoch 8, Reconstruction Loss: 0.6115333565017756, Classification Loss: 0.05111180777221685
Test Reconstruction Loss: 0.8887706398963928, Test Classification Loss: 1.5856978585188453, Test Accuracy: 75.29610829103216%
Epoch 9, Reconstruction Loss: 0.6115317751482696, Classification Loss: 0.04917332645772334
Test Reconstruction Loss: 0.8887769225481394, Test Classification Loss: 1.2307323956818044, Test Accuracy: 78.42639593908629%
Epoch 10, Reconstruction Loss: 0.6115332701111335, Classification Loss: 0.04446662263720676
Test Reconstruction Loss: 0.8887770643105378, Test Classification Loss: 1.6570626085473072, Test Accuracy: 74.78849407783417%
Epoch 11, Reconstruction Loss: 0.6115354440822733, Classification Loss: 0.023080931643062127
Test Reconstruction Loss: 0.8887690225163022, Test Classification Loss: 1.8061581185871005, Test Accuracy: 77.49576988155668%
Epoch 12, Reconstruction Loss: 0.6115340192425209, Classification Loss: 0.015943363448663476
Test Reconstruction Loss: 0.8887694220285158, Test Classification Loss: 1.8344324903745153, Test Accuracy: 78.00338409475465%
Epoch 13, Reconstruction Loss: 0.6115319679094846, Classification Loss: 0.01413560981308865
Test Reconstruction Loss: 0.8887705835136207, Test Classification Loss: 1.7979167628643178, Test Accuracy: 77.91878172588832%
Epoch 14, Reconstruction Loss: 0.6115347187463388, Classification Loss: 0.012314112992770281
Test Reconstruction Loss: 0.8887674582971109, Test Classification Loss: 2.072312481841387, Test Accuracy: 77.07275803722504%
Epoch 15, Reconstruction Loss: 0.6115364242332204, Classification Loss: 0.012058232381769627
Test Reconstruction Loss: 0.8887691836099367, Test Classification Loss: 2.0831531397612904, Test Accuracy: 78.34179357021996%
Epoch 16, Reconstruction Loss: 0.6115351645694884, Classification Loss: 0.011154706385839236
Test Reconstruction Loss: 0.8887724828075718, Test Classification Loss: 2.031410675019976, Test Accuracy: 77.49576988155668%
Epoch 17, Reconstruction Loss: 0.6115344361656677, Classification Loss: 0.011016909486063963
Test Reconstruction Loss: 0.8887700422390087, Test Classification Loss: 2.1606039143387386, Test Accuracy: 76.90355329949239%
Epoch 18, Reconstruction Loss: 0.6115348042088808, Classification Loss: 0.011500677065773004
Test Reconstruction Loss: 0.8887685617885074, Test Classification Loss: 2.160907399741936, Test Accuracy: 78.08798646362098%
Epoch 19, Reconstruction Loss: 0.6115356631912162, Classification Loss: 0.009560889692374671
Test Reconstruction Loss: 0.8887708734821629, Test Classification Loss: 2.108197090019727, Test Accuracy: 77.91878172588832%
Epoch 20, Reconstruction Loss: 0.6115323838529703, Classification Loss: 0.010052478367817778
Test Reconstruction Loss: 0.888770221052943, Test Classification Loss: 2.2418573479879194, Test Accuracy: 78.00338409475465%
Epoch 21, Reconstruction Loss: 0.6115348307112921, Classification Loss: 0.007913808090644474
Test Reconstruction Loss: 0.8887694284722611, Test Classification Loss: 2.2238873629583003, Test Accuracy: 78.08798646362098%
Epoch 22, Reconstruction Loss: 0.6115349755464541, Classification Loss: 0.007035895414091884
Test Reconstruction Loss: 0.8887704449730951, Test Classification Loss: 2.173383371261455, Test Accuracy: 78.1725888324873%
Epoch 23, Reconstruction Loss: 0.6115339775295818, Classification Loss: 0.0063489577133830945
Test Reconstruction Loss: 0.8887705126324216, Test Classification Loss: 2.3319979521970584, Test Accuracy: 78.59560067681895%
Epoch 24, Reconstruction Loss: 0.6115327943309781, Classification Loss: 0.006731305910551811
Test Reconstruction Loss: 0.8887703080435057, Test Classification Loss: 2.2841661384534104, Test Accuracy: 78.08798646362098%
Epoch 25, Reconstruction Loss: 0.6115364989710514, Classification Loss: 0.006315599890777464
Test Reconstruction Loss: 0.8887699198078465, Test Classification Loss: 2.196653267764329, Test Accuracy: 78.42639593908629%
Epoch 26, Reconstruction Loss: 0.6115336026287409, Classification Loss: 0.006185977677432675
Test Reconstruction Loss: 0.8887700873452264, Test Classification Loss: 2.3017566570708023, Test Accuracy: 78.51099830795262%
Epoch 27, Reconstruction Loss: 0.6115318661020701, Classification Loss: 0.0058691332596051155
Test Reconstruction Loss: 0.8887697941548115, Test Classification Loss: 2.2618620617627254, Test Accuracy: 78.34179357021996%
Epoch 28, Reconstruction Loss: 0.6115345039427487, Classification Loss: 0.004803925938737453
Test Reconstruction Loss: 0.8887704836355673, Test Classification Loss: 2.479979938968894, Test Accuracy: 78.1725888324873%
Epoch 29, Reconstruction Loss: 0.6115342330406693, Classification Loss: 0.005874324455518684
Test Reconstruction Loss: 0.8887699375281463, Test Classification Loss: 2.3927476492761444, Test Accuracy: 78.1725888324873%
Epoch 30, Reconstruction Loss: 0.6115320019140376, Classification Loss: 0.006373061968985759
Test Reconstruction Loss: 0.8887699794124913, Test Classification Loss: 2.268828430585632, Test Accuracy: 78.25719120135363%
Average loss (reconstruction) on the test data: 0.8888
Average loss (classification) on the test data: 1.8266
Accuracy on the test data: 78.26%