Siamese Network: Prediction Accuracy and Training Loss

Training a single CNN on my dataset works as expected, reaching around 85% accuracy. I wanted to implement a siamese network to see whether it could improve on that. However, the accuracy just fluctuates between 45% and 59%, and neither the training loss nor the test loss seems to move from its initial value.

I tried using contrastive loss, but didn’t know how to compute an accuracy with it, so I switched to BCELoss. The motivation for using BCE instead of contrastive loss came from here. Below is the CNN model I modified for the siamese network:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels = 1, out_channels = 32, kernel_size = 5)
        self.pool = nn.MaxPool2d(2,2)
        self.conv2 = nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3)
        self.conv3 = nn.Conv2d(in_channels = 64, out_channels = 128, kernel_size = 3)
        self.fc_drop = nn.Dropout(p = 0.5)

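        # adaptive average pooling to a fixed 20x1 map per channel
        # (not a true global average pool, which would use output size 1)
        self.gap = nn.AdaptiveAvgPool2d((20,1))
        self.fc1 = nn.Linear(128*20, 512)
        self.fc2 = nn.Linear(512, 128)
        # 256 inputs: two 128-d embeddings concatenated in forward()
        self.fc3 = nn.Linear(256, 1)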

    def forward_once(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))

        # prep for linear layer
        # flattening
        # x = x.view(x.size(0), -1)
        x = self.gap(x)
        x = x.view(x.size(0), -1)
        # two linear layers with dropout in between
        x = F.relu(self.fc1(x))
        x = self.fc_drop(x)
        x = F.relu(self.fc2(x))
        # x = self.fc_drop(x)
        # x = self.fc3(x)
        return x


    def forward(self, in1, in2):
        out1 = self.forward_once(in1)
        out2 = self.forward_once(in2)
        out1 = self.fc_drop(out1)
        out2 = self.fc_drop(out2)
        x = torch.cat((out1,out2),1)
        x = self.fc3(x)
        x = torch.sigmoid(x)
        return x

## TRAINING
model = ConvNet()
loss_list, acc_list = [], []  # per-batch training loss and test accuracy
# Loss and optimizer
# criterion = ContrastiveLoss()
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, momentum = 0.9)

# Train the model
for epoch in range(num_epochs):
    accuracy = 0
    running_loss = 0
    for images1, images2, labels in train_loader:
        # Run the forward pass
        outputs = model(images1, images2)
        labels = labels.float()
        labels = labels.unsqueeze(1)
        loss = criterion(outputs, labels)

        loss_list.append(loss.item())

        # Backprop and perform SGD optimisation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    else:  # for/else: runs once per epoch, after the training loop finishes
        test_loss = 0
        accuracy = 0
        correct = 0

        with torch.no_grad():
            model.eval()
            for images1, images2, labels in test_loader:
                outputs = model(images1, images2)
                labels = labels.float()
                labels = labels.unsqueeze(1)
                total = labels.size(0)
                test_loss += criterion(outputs, labels).item()
                # count predictions on the correct side of the 0.5 threshold
                for j in range(outputs.size()[0]):
                    if outputs[j] <= 0.5 and labels[j] == 0:
                        correct += 1
                    if outputs[j] > 0.5 and labels[j] == 1:
                        correct += 1
                accuracy += correct/total
                # record the batch accuracy before resetting the counter
                acc_list.append(correct / total)
                correct = 0
        model.train()
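
The per-sample accuracy loop in the evaluation code can also be written as a single tensor comparison. The sketch below is a minimal illustration, not part of the original post; `batch_accuracy` is a hypothetical helper name, and it assumes outputs and labels both have shape (batch, 1) as in the loop above.

```python
import torch

def batch_accuracy(outputs: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of pairs whose thresholded output matches the 0/1 label."""
    # outputs > 0.5 predicts class 1, outputs <= 0.5 predicts class 0,
    # which mirrors the two `if` branches of the explicit loop
    preds = (outputs > 0.5).float()
    return (preds == labels).float().mean().item()

# Toy check with made-up sigmoid outputs and labels, shape (batch, 1)
outputs = torch.tensor([[0.9], [0.2], [0.6], [0.4]])
labels = torch.tensor([[1.0], [0.0], [0.0], [1.0]])
acc = batch_accuracy(outputs, labels)
print(acc)  # two of the four toy pairs are classified correctly
```

Inside the test loop this would replace the `for j in range(...)` block, with `acc_list.append(batch_accuracy(outputs, labels))` recording one value per batch.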

And the output is:
Epoch: 1/500… Training Loss: 0.693… Test Loss: 0.692… Test Accuracy: 0.562
Epoch: 2/500… Training Loss: 0.694… Test Loss: 0.693… Test Accuracy: 0.508
Epoch: 3/500… Training Loss: 0.694… Test Loss: 0.694… Test Accuracy: 0.453
Epoch: 4/500… Training Loss: 0.694… Test Loss: 0.692… Test Accuracy: 0.535
Epoch: 5/500… Training Loss: 0.694… Test Loss: 0.693… Test Accuracy: 0.480
Epoch: 6/500… Training Loss: 0.694… Test Loss: 0.692… Test Accuracy: 0.535
Epoch: 7/500… Training Loss: 0.694… Test Loss: 0.693… Test Accuracy: 0.480
Epoch: 8/500… Training Loss: 0.694… Test Loss: 0.692… Test Accuracy: 0.535
Epoch: 9/500… Training Loss: 0.693… Test Loss: 0.692… Test Accuracy: 0.535
Epoch: 10/500… Training Loss: 0.693… Test Loss: 0.692… Test Accuracy: 0.562
Epoch: 11/500… Training Loss: 0.693… Test Loss: 0.693… Test Accuracy: 0.480

Are there any glaringly obvious mistakes in my code or my approach? Any input is greatly appreciated. Apologies for any formatting errors; this is my first time using this discussion board.
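
On the question of getting an accuracy out of a contrastive-loss setup: one common approach is to threshold the distance between the two embeddings. The sketch below is only an illustration under assumptions that are not in the post: `distance_accuracy` is a hypothetical helper, the embeddings would come from `forward_once`, a label of 1 is taken to mean "similar pair", and the threshold is a free parameter that would need tuning on held-out data.

```python
import torch
import torch.nn.functional as F

def distance_accuracy(emb1, emb2, labels, threshold):
    """Accuracy when pairs closer than `threshold` are predicted as similar.

    Assumes labels == 1 marks a similar pair and 0 a dissimilar one;
    flip the comparison if your dataset uses the opposite convention.
    """
    dist = F.pairwise_distance(emb1, emb2)   # Euclidean distance, shape (batch,)
    preds = (dist < threshold).float()       # 1 = predicted similar
    return (preds == labels.float().view(-1)).float().mean().item()

# Toy embeddings: pair 0 is close, pairs 1 and 2 are far apart
emb1 = torch.tensor([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
emb2 = torch.tensor([[0.1, 0.0], [4.0, 5.0], [3.0, 4.0]])
labels = torch.tensor([1, 0, 1])
acc = distance_accuracy(emb1, emb2, labels, threshold=1.0)
print(acc)
```

With this kind of evaluation the network would return the two embeddings instead of a sigmoid score, and the threshold plays the role that 0.5 plays for BCELoss.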

I cannot see any obvious errors in the code. I would recommend trying to overfit a small data sample (e.g. just 10 samples) to make sure your new model is able to do so.
Based on the architecture, you might need to lower the learning rate for the common (shared) layers, as their gradients might be larger in magnitude.
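
A lower learning rate for the shared layers can be set with per-parameter groups in the optimizer. The sketch below is a minimal illustration under assumptions: `TinySiamese` is a hypothetical stand-in for the ConvNet above, and the base learning rate and the 0.1 factor are arbitrary example values, not recommendations.

```python
import torch
import torch.nn as nn

class TinySiamese(nn.Module):
    """Hypothetical stand-in: a shared trunk plus a head on the pair."""
    def __init__(self):
        super().__init__()
        # trunk weights are used for both inputs, so their gradients
        # accumulate contributions from both branches
        self.trunk = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
        self.head = nn.Linear(16, 1)  # sees the two 8-d embeddings concatenated

    def forward(self, in1, in2):
        x = torch.cat((self.trunk(in1), self.trunk(in2)), dim=1)
        return self.head(x)

model = TinySiamese()
base_lr = 1e-2  # assumed value for illustration
optimizer = torch.optim.SGD(
    [
        # shared layers: 10x smaller learning rate (example factor)
        {"params": model.trunk.parameters(), "lr": base_lr * 0.1},
        # head: falls back to the default lr given below
        {"params": model.head.parameters()},
    ],
    lr=base_lr,
    momentum=0.9,
)

for group in optimizer.param_groups:
    print(group["lr"])
```

For the ConvNet in the question, the first group would hold the conv and fc1/fc2 parameters and the second group would hold fc3.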

Thanks for the suggestion. I tried overfitting the model on a small dataset and it wasn’t able to do so. I also tried using Adam instead of SGD, but that didn’t change anything.

How can I change my architecture so that this network actually learns?

I would start with a very simple model (e.g. just conv - pool - relu - linear) and try to overfit a single sample. If even this doesn’t work, your training code might have a bug that I haven’t spotted so far.
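
Such a sanity check could look roughly like the sketch below. It is a minimal illustration under assumptions: `TinySiameseNet` is a hypothetical conv - pool - relu - linear model, the inputs are four random 1x16x16 image pairs with made-up labels rather than real data, and Adam with a learning rate of 1e-2 is an arbitrary choice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # make the toy run repeatable

class TinySiameseNet(nn.Module):
    """Hypothetical minimal model: conv - pool - relu - linear."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3)   # 16x16 -> 14x14
        self.pool = nn.MaxPool2d(2, 2)               # 14x14 -> 7x7
        self.fc = nn.Linear(2 * 4 * 7 * 7, 1)        # two flattened branches

    def forward_once(self, x):
        x = F.relu(self.pool(self.conv(x)))
        return x.view(x.size(0), -1)

    def forward(self, in1, in2):
        x = torch.cat((self.forward_once(in1), self.forward_once(in2)), dim=1)
        return torch.sigmoid(self.fc(x))

# A tiny fixed batch of random pairs with made-up labels
images1 = torch.randn(4, 1, 16, 16)
images2 = torch.randn(4, 1, 16, 16)
labels = torch.tensor([[0.0], [1.0], [0.0], [1.0]])

model = TinySiameseNet()
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = criterion(model(images1, images2), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

If the loss on such a fixed batch does not fall well below its starting value, the problem is more likely in the training loop or data pipeline than in the architecture.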

I removed the dropout layers from my forward pass, which allowed the model to train on the training set. However, it still overfits almost instantly and does not generalize to the test set.

    def forward(self, in1, in2):
        out1 = self.forward_once(in1)
        out2 = self.forward_once(in2)
        # out1 = self.fc_drop(out1)
        # out2 = self.fc_drop(out2)
        x = torch.cat((out1,out2),1)
        x = self.fc3(x)
        x = torch.sigmoid(x)
        return x

Epoch: 1/500..  Training Loss: 0.692..  Train Accuracy: 0.520.. Test Loss: 0.691..  Test Accuracy: 0.548
Epoch: 2/500..  Training Loss: 0.692..  Train Accuracy: 0.518.. Test Loss: 0.690..  Test Accuracy: 0.571
Epoch: 3/500..  Training Loss: 0.692..  Train Accuracy: 0.518.. Test Loss: 0.689..  Test Accuracy: 0.571
Epoch: 4/500..  Training Loss: 0.693..  Train Accuracy: 0.513.. Test Loss: 0.687..  Test Accuracy: 0.583
Epoch: 5/500..  Training Loss: 0.692..  Train Accuracy: 0.518.. Test Loss: 0.688..  Test Accuracy: 0.583
Epoch: 6/500..  Training Loss: 0.691..  Train Accuracy: 0.520.. Test Loss: 0.688..  Test Accuracy: 0.583
Epoch: 7/500..  Training Loss: 0.691..  Train Accuracy: 0.520.. Test Loss: 0.689..  Test Accuracy: 0.560
Epoch: 8/500..  Training Loss: 0.692..  Train Accuracy: 0.515.. Test Loss: 0.686..  Test Accuracy: 0.595
Epoch: 9/500..  Training Loss: 0.690..  Train Accuracy: 0.515.. Test Loss: 0.689..  Test Accuracy: 0.560
Epoch: 10/500..  Training Loss: 0.691..  Train Accuracy: 0.515.. Test Loss: 0.688..  Test Accuracy: 0.571
Epoch: 11/500..  Training Loss: 0.691..  Train Accuracy: 0.518.. Test Loss: 0.687..  Test Accuracy: 0.583
Epoch: 12/500..  Training Loss: 0.690..  Train Accuracy: 0.518.. Test Loss: 0.687..  Test Accuracy: 0.583
Epoch: 13/500..  Training Loss: 0.689..  Train Accuracy: 0.520.. Test Loss: 0.688..  Test Accuracy: 0.571
Epoch: 14/500..  Training Loss: 0.690..  Train Accuracy: 0.519.. Test Loss: 0.687..  Test Accuracy: 0.583
Epoch: 15/500..  Training Loss: 0.691..  Train Accuracy: 0.513.. Test Loss: 0.689..  Test Accuracy: 0.536
Epoch: 16/500..  Training Loss: 0.688..  Train Accuracy: 0.542.. Test Loss: 0.688..  Test Accuracy: 0.548
Epoch: 17/500..  Training Loss: 0.689..  Train Accuracy: 0.519.. Test Loss: 0.687..  Test Accuracy: 0.548
Epoch: 18/500..  Training Loss: 0.688..  Train Accuracy: 0.538.. Test Loss: 0.687..  Test Accuracy: 0.548
Epoch: 19/500..  Training Loss: 0.688..  Train Accuracy: 0.549.. Test Loss: 0.687..  Test Accuracy: 0.548
Epoch: 20/500..  Training Loss: 0.687..  Train Accuracy: 0.571.. Test Loss: 0.687..  Test Accuracy: 0.548
Epoch: 21/500..  Training Loss: 0.687..  Train Accuracy: 0.564.. Test Loss: 0.686..  Test Accuracy: 0.548
Epoch: 22/500..  Training Loss: 0.687..  Train Accuracy: 0.529.. Test Loss: 0.686..  Test Accuracy: 0.548
Epoch: 23/500..  Training Loss: 0.686..  Train Accuracy: 0.546.. Test Loss: 0.684..  Test Accuracy: 0.560
Epoch: 24/500..  Training Loss: 0.685..  Train Accuracy: 0.586.. Test Loss: 0.684..  Test Accuracy: 0.595
Epoch: 25/500..  Training Loss: 0.684..  Train Accuracy: 0.586.. Test Loss: 0.688..  Test Accuracy: 0.619
Epoch: 26/500..  Training Loss: 0.686..  Train Accuracy: 0.571.. Test Loss: 0.686..  Test Accuracy: 0.548
Epoch: 27/500..  Training Loss: 0.680..  Train Accuracy: 0.620.. Test Loss: 0.687..  Test Accuracy: 0.571
Epoch: 28/500..  Training Loss: 0.679..  Train Accuracy: 0.593.. Test Loss: 0.693..  Test Accuracy: 0.536
Epoch: 29/500..  Training Loss: 0.677..  Train Accuracy: 0.594.. Test Loss: 0.689..  Test Accuracy: 0.595
Epoch: 30/500..  Training Loss: 0.678..  Train Accuracy: 0.586.. Test Loss: 0.681..  Test Accuracy: 0.560
Epoch: 31/500..  Training Loss: 0.674..  Train Accuracy: 0.630.. Test Loss: 0.684..  Test Accuracy: 0.643
Epoch: 32/500..  Training Loss: 0.671..  Train Accuracy: 0.621.. Test Loss: 0.692..  Test Accuracy: 0.548
Epoch: 33/500..  Training Loss: 0.694..  Train Accuracy: 0.527.. Test Loss: 0.682..  Test Accuracy: 0.583
Epoch: 34/500..  Training Loss: 0.672..  Train Accuracy: 0.623.. Test Loss: 0.714..  Test Accuracy: 0.440
Epoch: 35/500..  Training Loss: 0.675..  Train Accuracy: 0.570.. Test Loss: 0.695..  Test Accuracy: 0.548
Epoch: 36/500..  Training Loss: 0.670..  Train Accuracy: 0.619.. Test Loss: 0.683..  Test Accuracy: 0.643
Epoch: 37/500..  Training Loss: 0.668..  Train Accuracy: 0.600.. Test Loss: 0.690..  Test Accuracy: 0.571
Epoch: 38/500..  Training Loss: 0.662..  Train Accuracy: 0.577.. Test Loss: 0.695..  Test Accuracy: 0.548
Epoch: 39/500..  Training Loss: 0.668..  Train Accuracy: 0.564.. Test Loss: 0.698..  Test Accuracy: 0.571
Epoch: 40/500..  Training Loss: 0.658..  Train Accuracy: 0.624.. Test Loss: 0.702..  Test Accuracy: 0.583
Epoch: 41/500..  Training Loss: 0.660..  Train Accuracy: 0.600.. Test Loss: 0.723..  Test Accuracy: 0.452
Epoch: 42/500..  Training Loss: 0.676..  Train Accuracy: 0.607.. Test Loss: 0.706..  Test Accuracy: 0.583
Epoch: 43/500..  Training Loss: 0.646..  Train Accuracy: 0.650.. Test Loss: 0.736..  Test Accuracy: 0.452
Epoch: 44/500..  Training Loss: 0.668..  Train Accuracy: 0.590.. Test Loss: 0.704..  Test Accuracy: 0.595
Epoch: 45/500..  Training Loss: 0.645..  Train Accuracy: 0.682.. Test Loss: 0.704..  Test Accuracy: 0.607
Epoch: 46/500..  Training Loss: 0.653..  Train Accuracy: 0.635.. Test Loss: 0.737..  Test Accuracy: 0.512
Epoch: 47/500..  Training Loss: 0.655..  Train Accuracy: 0.600.. Test Loss: 0.718..  Test Accuracy: 0.560
Epoch: 48/500..  Training Loss: 0.654..  Train Accuracy: 0.610.. Test Loss: 0.720..  Test Accuracy: 0.595
Epoch: 49/500..  Training Loss: 0.635..  Train Accuracy: 0.644.. Test Loss: 0.717..  Test Accuracy: 0.571
Epoch: 50/500..  Training Loss: 0.646..  Train Accuracy: 0.608.. Test Loss: 0.729..  Test Accuracy: 0.583
Epoch: 51/500..  Training Loss: 0.625..  Train Accuracy: 0.645.. Test Loss: 0.747..  Test Accuracy: 0.583
Epoch: 52/500..  Training Loss: 0.648..  Train Accuracy: 0.617.. Test Loss: 0.755..  Test Accuracy: 0.524
Epoch: 53/500..  Training Loss: 0.638..  Train Accuracy: 0.633.. Test Loss: 0.770..  Test Accuracy: 0.571
Epoch: 54/500..  Training Loss: 0.650..  Train Accuracy: 0.623.. Test Loss: 0.766..  Test Accuracy: 0.488
Epoch: 55/500..  Training Loss: 0.686..  Train Accuracy: 0.523.. Test Loss: 0.730..  Test Accuracy: 0.512
Epoch: 56/500..  Training Loss: 0.679..  Train Accuracy: 0.590.. Test Loss: 0.699..  Test Accuracy: 0.548
Epoch: 57/500..  Training Loss: 0.651..  Train Accuracy: 0.615.. Test Loss: 0.693..  Test Accuracy: 0.619
Epoch: 58/500..  Training Loss: 0.645..  Train Accuracy: 0.602.. Test Loss: 0.699..  Test Accuracy: 0.488
Epoch: 59/500..  Training Loss: 0.650..  Train Accuracy: 0.631.. Test Loss: 0.693..  Test Accuracy: 0.655
Epoch: 60/500..  Training Loss: 0.635..  Train Accuracy: 0.636.. Test Loss: 0.702..  Test Accuracy: 0.595
Epoch: 61/500..  Training Loss: 0.631..  Train Accuracy: 0.664.. Test Loss: 0.721..  Test Accuracy: 0.631
Epoch: 62/500..  Training Loss: 0.624..  Train Accuracy: 0.673.. Test Loss: 0.742..  Test Accuracy: 0.619
Epoch: 63/500..  Training Loss: 0.619..  Train Accuracy: 0.677.. Test Loss: 0.766..  Test Accuracy: 0.595
Epoch: 64/500..  Training Loss: 0.615..  Train Accuracy: 0.685.. Test Loss: 0.798..  Test Accuracy: 0.560
Epoch: 65/500..  Training Loss: 0.619..  Train Accuracy: 0.664.. Test Loss: 0.762..  Test Accuracy: 0.595
Epoch: 66/500..  Training Loss: 0.603..  Train Accuracy: 0.675.. Test Loss: 0.802..  Test Accuracy: 0.524
Epoch: 67/500..  Training Loss: 0.646..  Train Accuracy: 0.580.. Test Loss: 0.778..  Test Accuracy: 0.536
Epoch: 68/500..  Training Loss: 0.605..  Train Accuracy: 0.720.. Test Loss: 0.761..  Test Accuracy: 0.548
Epoch: 69/500..  Training Loss: 0.621..  Train Accuracy: 0.636.. Test Loss: 0.771..  Test Accuracy: 0.607
Epoch: 70/500..  Training Loss: 0.628..  Train Accuracy: 0.688.. Test Loss: 0.834..  Test Accuracy: 0.452
Epoch: 71/500..  Training Loss: 0.618..  Train Accuracy: 0.654.. Test Loss: 0.791..  Test Accuracy: 0.571
Epoch: 72/500..  Training Loss: 0.584..  Train Accuracy: 0.706.. Test Loss: 0.766..  Test Accuracy: 0.512
Epoch: 73/500..  Training Loss: 0.629..  Train Accuracy: 0.658.. Test Loss: 0.793..  Test Accuracy: 0.536
Epoch: 74/500..  Training Loss: 0.634..  Train Accuracy: 0.619.. Test Loss: 0.759..  Test Accuracy: 0.500
Epoch: 75/500..  Training Loss: 0.582..  Train Accuracy: 0.721.. Test Loss: 0.786..  Test Accuracy: 0.607
Epoch: 76/500..  Training Loss: 0.611..  Train Accuracy: 0.665.. Test Loss: 0.807..  Test Accuracy: 0.607
Epoch: 77/500..  Training Loss: 0.618..  Train Accuracy: 0.639.. Test Loss: 0.819..  Test Accuracy: 0.571
Epoch: 78/500..  Training Loss: 0.614..  Train Accuracy: 0.664.. Test Loss: 0.759..  Test Accuracy: 0.512
Epoch: 79/500..  Training Loss: 0.610..  Train Accuracy: 0.701.. Test Loss: 0.757..  Test Accuracy: 0.500
Epoch: 80/500..  Training Loss: 0.600..  Train Accuracy: 0.683.. Test Loss: 0.809..  Test Accuracy: 0.548
Epoch: 81/500..  Training Loss: 0.607..  Train Accuracy: 0.667.. Test Loss: 0.822..  Test Accuracy: 0.548
Epoch: 82/500..  Training Loss: 0.586..  Train Accuracy: 0.656.. Test Loss: 0.806..  Test Accuracy: 0.476
Epoch: 83/500..  Training Loss: 0.599..  Train Accuracy: 0.705.. Test Loss: 0.803..  Test Accuracy: 0.464
Epoch: 84/500..  Training Loss: 0.583..  Train Accuracy: 0.706.. Test Loss: 0.882..  Test Accuracy: 0.512
Epoch: 85/500..  Training Loss: 0.623..  Train Accuracy: 0.693.. Test Loss: 0.838..  Test Accuracy: 0.560
Epoch: 86/500..  Training Loss: 0.606..  Train Accuracy: 0.690.. Test Loss: 0.786..  Test Accuracy: 0.500
Epoch: 87/500..  Training Loss: 0.575..  Train Accuracy: 0.711.. Test Loss: 0.794..  Test Accuracy: 0.512
Epoch: 88/500..  Training Loss: 0.583..  Train Accuracy: 0.710.. Test Loss: 0.867..  Test Accuracy: 0.524
Epoch: 89/500..  Training Loss: 0.565..  Train Accuracy: 0.712.. Test Loss: 0.902..  Test Accuracy: 0.500
Epoch: 90/500..  Training Loss: 0.560..  Train Accuracy: 0.726.. Test Loss: 0.903..  Test Accuracy: 0.524
Epoch: 91/500..  Training Loss: 0.594..  Train Accuracy: 0.692.. Test Loss: 0.948..  Test Accuracy: 0.440
Epoch: 92/500..  Training Loss: 0.573..  Train Accuracy: 0.706.. Test Loss: 0.889..  Test Accuracy: 0.500
Epoch: 93/500..  Training Loss: 0.596..  Train Accuracy: 0.657.. Test Loss: 0.814..  Test Accuracy: 0.560
Epoch: 94/500..  Training Loss: 0.597..  Train Accuracy: 0.686.. Test Loss: 0.822..  Test Accuracy: 0.417
Epoch: 95/500..  Training Loss: 0.575..  Train Accuracy: 0.683.. Test Loss: 0.916..  Test Accuracy: 0.512
Epoch: 96/500..  Training Loss: 0.583..  Train Accuracy: 0.719.. Test Loss: 0.893..  Test Accuracy: 0.476
Epoch: 97/500..  Training Loss: 0.535..  Train Accuracy: 0.726.. Test Loss: 0.901..  Test Accuracy: 0.512
Epoch: 98/500..  Training Loss: 0.538..  Train Accuracy: 0.714.. Test Loss: 0.937..  Test Accuracy: 0.476
Epoch: 99/500..  Training Loss: 0.553..  Train Accuracy: 0.736.. Test Loss: 0.970..  Test Accuracy: 0.405
Epoch: 100/500..  Training Loss: 0.581..  Train Accuracy: 0.670.. Test Loss: 0.965..  Test Accuracy: 0.440
Epoch: 101/500..  Training Loss: 0.550..  Train Accuracy: 0.727.. Test Loss: 0.932..  Test Accuracy: 0.500
Epoch: 102/500..  Training Loss: 0.549..  Train Accuracy: 0.718.. Test Loss: 0.973..  Test Accuracy: 0.488
Epoch: 103/500..  Training Loss: 0.552..  Train Accuracy: 0.677.. Test Loss: 0.980..  Test Accuracy: 0.429
Epoch: 104/500..  Training Loss: 0.560..  Train Accuracy: 0.723.. Test Loss: 0.894..  Test Accuracy: 0.476
Epoch: 105/500..  Training Loss: 0.559..  Train Accuracy: 0.710.. Test Loss: 0.954..  Test Accuracy: 0.548
Epoch: 106/500..  Training Loss: 0.524..  Train Accuracy: 0.724.. Test Loss: 1.016..  Test Accuracy: 0.524
Epoch: 107/500..  Training Loss: 0.520..  Train Accuracy: 0.718.. Test Loss: 1.049..  Test Accuracy: 0.500
Epoch: 108/500..  Training Loss: 0.555..  Train Accuracy: 0.724.. Test Loss: 1.017..  Test Accuracy: 0.417
Epoch: 109/500..  Training Loss: 0.543..  Train Accuracy: 0.706.. Test Loss: 1.092..  Test Accuracy: 0.452
Epoch: 110/500..  Training Loss: 0.542..  Train Accuracy: 0.700.. Test Loss: 1.089..  Test Accuracy: 0.512
Epoch: 111/500..  Training Loss: 0.581..  Train Accuracy: 0.656.. Test Loss: 1.002..  Test Accuracy: 0.476
Epoch: 112/500..  Training Loss: 0.573..  Train Accuracy: 0.683.. Test Loss: 0.908..  Test Accuracy: 0.488
Epoch: 113/500..  Training Loss: 0.553..  Train Accuracy: 0.726.. Test Loss: 0.935..  Test Accuracy: 0.500
Epoch: 114/500..  Training Loss: 0.526..  Train Accuracy: 0.710.. Test Loss: 1.024..  Test Accuracy: 0.524
Epoch: 115/500..  Training Loss: 0.565..  Train Accuracy: 0.686.. Test Loss: 1.066..  Test Accuracy: 0.476
Epoch: 116/500..  Training Loss: 0.543..  Train Accuracy: 0.730.. Test Loss: 0.988..  Test Accuracy: 0.429
Epoch: 117/500..  Training Loss: 0.525..  Train Accuracy: 0.712.. Test Loss: 0.955..  Test Accuracy: 0.500
Epoch: 118/500..  Training Loss: 0.521..  Train Accuracy: 0.732.. Test Loss: 1.075..  Test Accuracy: 0.393
Epoch: 119/500..  Training Loss: 0.504..  Train Accuracy: 0.748.. Test Loss: 1.075..  Test Accuracy: 0.417
Epoch: 120/500..  Training Loss: 0.486..  Train Accuracy: 0.769.. Test Loss: 1.206..  Test Accuracy: 0.405
Epoch: 121/500..  Training Loss: 0.517..  Train Accuracy: 0.746.. Test Loss: 1.229..  Test Accuracy: 0.488
Epoch: 122/500..  Training Loss: 0.636..  Train Accuracy: 0.693.. Test Loss: 1.131..  Test Accuracy: 0.440
Epoch: 123/500..  Training Loss: 0.577..  Train Accuracy: 0.682.. Test Loss: 0.828..  Test Accuracy: 0.548
Epoch: 124/500..  Training Loss: 0.554..  Train Accuracy: 0.729.. Test Loss: 0.902..  Test Accuracy: 0.500
Epoch: 125/500..  Training Loss: 0.514..  Train Accuracy: 0.743.. Test Loss: 0.932..  Test Accuracy: 0.417
Epoch: 126/500..  Training Loss: 0.550..  Train Accuracy: 0.682.. Test Loss: 0.958..  Test Accuracy: 0.476
Epoch: 127/500..  Training Loss: 0.512..  Train Accuracy: 0.760.. Test Loss: 1.049..  Test Accuracy: 0.500
Epoch: 128/500..  Training Loss: 0.542..  Train Accuracy: 0.739.. Test Loss: 0.997..  Test Accuracy: 0.536
Epoch: 129/500..  Training Loss: 0.560..  Train Accuracy: 0.690.. Test Loss: 0.908..  Test Accuracy: 0.488
Epoch: 130/500..  Training Loss: 0.527..  Train Accuracy: 0.744.. Test Loss: 0.948..  Test Accuracy: 0.405
Epoch: 131/500..  Training Loss: 0.520..  Train Accuracy: 0.727.. Test Loss: 1.017..  Test Accuracy: 0.440
Epoch: 132/500..  Training Loss: 0.508..  Train Accuracy: 0.736.. Test Loss: 1.103..  Test Accuracy: 0.500
Epoch: 133/500..  Training Loss: 0.474..  Train Accuracy: 0.749.. Test Loss: 1.185..  Test Accuracy: 0.393
Epoch: 134/500..  Training Loss: 0.497..  Train Accuracy: 0.744.. Test Loss: 1.067..  Test Accuracy: 0.464
Epoch: 135/500..  Training Loss: 0.486..  Train Accuracy: 0.758.. Test Loss: 1.104..  Test Accuracy: 0.417
Epoch: 136/500..  Training Loss: 0.448..  Train Accuracy: 0.768.. Test Loss: 1.225..  Test Accuracy: 0.464
Epoch: 137/500..  Training Loss: 0.444..  Train Accuracy: 0.761.. Test Loss: 1.258..  Test Accuracy: 0.440
Epoch: 138/500..  Training Loss: 0.433..  Train Accuracy: 0.782.. Test Loss: 1.262..  Test Accuracy: 0.452
Epoch: 139/500..  Training Loss: 0.482..  Train Accuracy: 0.761.. Test Loss: 1.358..  Test Accuracy: 0.429
Epoch: 140/500..  Training Loss: 0.495..  Train Accuracy: 0.738.. Test Loss: 1.277..  Test Accuracy: 0.500
Epoch: 141/500..  Training Loss: 0.493..  Train Accuracy: 0.757.. Test Loss: 1.039..  Test Accuracy: 0.476
Epoch: 142/500..  Training Loss: 0.468..  Train Accuracy: 0.757.. Test Loss: 1.132..  Test Accuracy: 0.452
Epoch: 143/500..  Training Loss: 0.447..  Train Accuracy: 0.755.. Test Loss: 1.142..  Test Accuracy: 0.440
Epoch: 144/500..  Training Loss: 0.462..  Train Accuracy: 0.781.. Test Loss: 1.214..  Test Accuracy: 0.500
Epoch: 145/500..  Training Loss: 0.478..  Train Accuracy: 0.742.. Test Loss: 1.128..  Test Accuracy: 0.452
Epoch: 146/500..  Training Loss: 0.445..  Train Accuracy: 0.767.. Test Loss: 1.171..  Test Accuracy: 0.440
Epoch: 147/500..  Training Loss: 0.434..  Train Accuracy: 0.769.. Test Loss: 1.331..  Test Accuracy: 0.464
Epoch: 148/500..  Training Loss: 0.440..  Train Accuracy: 0.782.. Test Loss: 1.371..  Test Accuracy: 0.440
Epoch: 149/500..  Training Loss: 0.421..  Train Accuracy: 0.792.. Test Loss: 1.402..  Test Accuracy: 0.429
Epoch: 150/500..  Training Loss: 0.460..  Train Accuracy: 0.751.. Test Loss: 1.207..  Test Accuracy: 0.512
Epoch: 151/500..  Training Loss: 0.452..  Train Accuracy: 0.752.. Test Loss: 1.267..  Test Accuracy: 0.417
Epoch: 152/500..  Training Loss: 0.454..  Train Accuracy: 0.776.. Test Loss: 1.210..  Test Accuracy: 0.440
Epoch: 153/500..  Training Loss: 0.467..  Train Accuracy: 0.769.. Test Loss: 1.370..  Test Accuracy: 0.512
Epoch: 154/500..  Training Loss: 0.416..  Train Accuracy: 0.756.. Test Loss: 1.394..  Test Accuracy: 0.393
Epoch: 155/500..  Training Loss: 0.400..  Train Accuracy: 0.794.. Test Loss: 1.258..  Test Accuracy: 0.524
Epoch: 156/500..  Training Loss: 0.369..  Train Accuracy: 0.824.. Test Loss: 1.412..  Test Accuracy: 0.512
Epoch: 157/500..  Training Loss: 0.356..  Train Accuracy: 0.832.. Test Loss: 1.534..  Test Accuracy: 0.500
Epoch: 158/500..  Training Loss: 0.406..  Train Accuracy: 0.806.. Test Loss: 1.542..  Test Accuracy: 0.512
Epoch: 159/500..  Training Loss: 0.388..  Train Accuracy: 0.817.. Test Loss: 1.449..  Test Accuracy: 0.536
Epoch: 160/500..  Training Loss: 0.366..  Train Accuracy: 0.820.. Test Loss: 1.508..  Test Accuracy: 0.452
Epoch: 161/500..  Training Loss: 0.408..  Train Accuracy: 0.811.. Test Loss: 1.437..  Test Accuracy: 0.476
Epoch: 162/500..  Training Loss: 0.419..  Train Accuracy: 0.804.. Test Loss: 1.393..  Test Accuracy: 0.488
Epoch: 163/500..  Training Loss: 0.356..  Train Accuracy: 0.864.. Test Loss: 1.406..  Test Accuracy: 0.440
Epoch: 164/500..  Training Loss: 0.340..  Train Accuracy: 0.852.. Test Loss: 1.466..  Test Accuracy: 0.452
Epoch: 165/500..  Training Loss: 0.339..  Train Accuracy: 0.855.. Test Loss: 1.718..  Test Accuracy: 0.500
Epoch: 166/500..  Training Loss: 0.399..  Train Accuracy: 0.794.. Test Loss: 1.622..  Test Accuracy: 0.452
Epoch: 167/500..  Training Loss: 0.379..  Train Accuracy: 0.817.. Test Loss: 1.402..  Test Accuracy: 0.440
Epoch: 168/500..  Training Loss: 0.327..  Train Accuracy: 0.858.. Test Loss: 1.430..  Test Accuracy: 0.464
Epoch: 169/500..  Training Loss: 0.317..  Train Accuracy: 0.852.. Test Loss: 1.649..  Test Accuracy: 0.429
Epoch: 170/500..  Training Loss: 0.328..  Train Accuracy: 0.820.. Test Loss: 1.729..  Test Accuracy: 0.452
Epoch: 171/500..  Training Loss: 0.351..  Train Accuracy: 0.843.. Test Loss: 1.593..  Test Accuracy: 0.500
Epoch: 172/500..  Training Loss: 0.381..  Train Accuracy: 0.830.. Test Loss: 1.334..  Test Accuracy: 0.548
Epoch: 173/500..  Training Loss: 0.386..  Train Accuracy: 0.777.. Test Loss: 1.456..  Test Accuracy: 0.440
Epoch: 174/500..  Training Loss: 0.410..  Train Accuracy: 0.785.. Test Loss: 1.398..  Test Accuracy: 0.440
Epoch: 175/500..  Training Loss: 0.326..  Train Accuracy: 0.851.. Test Loss: 1.562..  Test Accuracy: 0.548
Epoch: 176/500..  Training Loss: 0.327..  Train Accuracy: 0.852.. Test Loss: 1.620..  Test Accuracy: 0.440
Epoch: 177/500..  Training Loss: 0.319..  Train Accuracy: 0.862.. Test Loss: 1.866..  Test Accuracy: 0.548
Epoch: 178/500..  Training Loss: 0.257..  Train Accuracy: 0.890.. Test Loss: 1.996..  Test Accuracy: 0.440
Epoch: 179/500..  Training Loss: 0.251..  Train Accuracy: 0.911.. Test Loss: 1.795..  Test Accuracy: 0.536
Epoch: 180/500..  Training Loss: 0.291..  Train Accuracy: 0.837.. Test Loss: 1.713..  Test Accuracy: 0.500
Epoch: 181/500..  Training Loss: 0.260..  Train Accuracy: 0.877.. Test Loss: 1.877..  Test Accuracy: 0.440
Epoch: 182/500..  Training Loss: 0.253..  Train Accuracy: 0.907.. Test Loss: 2.102..  Test Accuracy: 0.393
Epoch: 183/500..  Training Loss: 0.238..  Train Accuracy: 0.908.. Test Loss: 2.051..  Test Accuracy: 0.476
Epoch: 184/500..  Training Loss: 0.220..  Train Accuracy: 0.890.. Test Loss: 2.047..  Test Accuracy: 0.464
Epoch: 185/500..  Training Loss: 0.185..  Train Accuracy: 0.929.. Test Loss: 2.219..  Test Accuracy: 0.381
Epoch: 186/500..  Training Loss: 0.247..  Train Accuracy: 0.839.. Test Loss: 2.216..  Test Accuracy: 0.524
Epoch: 187/500..  Training Loss: 0.190..  Train Accuracy: 0.935.. Test Loss: 2.470..  Test Accuracy: 0.548
Epoch: 188/500..  Training Loss: 0.196..  Train Accuracy: 0.931.. Test Loss: 2.896..  Test Accuracy: 0.440
Epoch: 189/500..  Training Loss: 0.250..  Train Accuracy: 0.892.. Test Loss: 2.368..  Test Accuracy: 0.512
Epoch: 190/500..  Training Loss: 0.223..  Train Accuracy: 0.882.. Test Loss: 2.351..  Test Accuracy: 0.405
Epoch: 191/500..  Training Loss: 0.251..  Train Accuracy: 0.894.. Test Loss: 2.406..  Test Accuracy: 0.405
Epoch: 192/500..  Training Loss: 0.258..  Train Accuracy: 0.914.. Test Loss: 2.470..  Test Accuracy: 0.429
Epoch: 193/500..  Training Loss: 0.211..  Train Accuracy: 0.924.. Test Loss: 2.192..  Test Accuracy: 0.452
Epoch: 194/500..  Training Loss: 0.248..  Train Accuracy: 0.906.. Test Loss: 2.368..  Test Accuracy: 0.440
Epoch: 195/500..  Training Loss: 0.226..  Train Accuracy: 0.926.. Test Loss: 2.552..  Test Accuracy: 0.357
Epoch: 196/500..  Training Loss: 0.223..  Train Accuracy: 0.904.. Test Loss: 2.476..  Test Accuracy: 0.429
Epoch: 197/500..  Training Loss: 0.188..  Train Accuracy: 0.930.. Test Loss: 2.441..  Test Accuracy: 0.440
Epoch: 198/500..  Training Loss: 0.192..  Train Accuracy: 0.904.. Test Loss: 2.796..  Test Accuracy: 0.405
Epoch: 199/500..  Training Loss: 0.184..  Train Accuracy: 0.932.. Test Loss: 2.115..  Test Accuracy: 0.488
Epoch: 200/500..  Training Loss: 0.148..  Train Accuracy: 0.962.. Test Loss: 2.419..  Test Accuracy: 0.464

This is with 200 samples, but now I’m not sure how to stop it from overfitting so quickly. At least there is some small progress so far.
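
One standard way to limit the damage from this kind of overfitting is early stopping: keep the weights from the epoch with the lowest held-out loss and stop once it has not improved for a while. The sketch below is only an illustration; `best_epoch_with_patience` is a hypothetical helper, the patience of 3 is an arbitrary example, and in practice the monitored loss should come from a validation split separate from the final test set.

```python
def best_epoch_with_patience(held_out_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the loss has gone `patience` consecutive epochs
    without improving; `best_epoch` is where the lowest loss was seen.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(held_out_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(held_out_losses), best_epoch

# Test losses from epochs 1-10 of the log above
logged = [0.691, 0.690, 0.689, 0.687, 0.688, 0.688, 0.689, 0.686, 0.689, 0.688]
stop_epoch, best_epoch = best_epoch_with_patience(logged, patience=3)
print(stop_epoch, best_epoch)  # 7 4
```

With a patience this small the run stops at epoch 7, just before the slightly lower loss at epoch 8, so the patience value is itself something to tune. Early stopping only picks the best point of a run; it does not address the small dataset itself.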