Help debugging code

Hi, I'm pretty new to using PyTorch for training a model and I'm trying to debug this code, because the accuracy seems pretty strange (some runs give 0.02 and others 1.0). Can someone help me spot a bug in the code?
The following is the training code:

import torch
from sklearn.metrics import precision_score, recall_score, f1_score

num_epochs = 150
batch_size = 1024
batch_start = torch.arange(0, len(X_train), batch_size)
precision_train_ = []
accuracy_train_ = []
recall_train_ = []
f1_train_ = []
loss_train_ = []

precision_val_ = []
accuracy_val_ = []
recall_val_ = []
f1_val_ = []
loss_val_ = []

print("Starting training...")

for Epoch in range(num_epochs):
    for start in batch_start:
        X_train_batch = X_train[start:start+batch_size]
        y_train_batch = y_train[start:start+batch_size]

        # Forward pass
        outputs = model(X_train_batch)
        loss = criterion(outputs, y_train_batch.view(-1, 1))

        # Backpropagation and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        with torch.no_grad():
            model.eval()

            # Predictions on training set
            predictions_train = model(X_train_batch)
            loss_train = criterion(predictions_train, y_train_batch.view(-1, 1))
            predictions_train = (predictions_train > 0.5).float()

            # Metrics on training set
            accuracy_train = torch.sum(predictions_train == y_train_batch.view(-1, 1)).item() / y_train.shape[0]
            accuracy_train_.append(accuracy_train)
            precision_train = precision_score(y_train_batch, predictions_train, average='weighted')
            precision_train_.append(precision_train)
            recall_train = recall_score(y_train_batch, predictions_train, average='weighted')
            recall_train_.append(recall_train)
            f1_train = f1_score(y_train_batch, predictions_train, average='weighted')
            f1_train_.append(f1_train)
            loss_train_.append(loss_train)

            # Predictions on validation set
            predictions_val = model(X_test)
            loss_val = criterion(predictions_val, y_test.view(-1, 1))
            predictions_val = (predictions_val > 0.5).float()

            # Metrics on validation set
            accuracy_val = torch.sum(predictions_val == y_test.view(-1, 1)).item() / y_test.shape[0]
            accuracy_val_.append(accuracy_val)
            precision_val = precision_score(y_test, predictions_val, average='weighted')
            precision_val_.append(precision_val)
            recall_val = recall_score(y_test, predictions_val, average='weighted')
            recall_val_.append(recall_val)
            f1_val = f1_score(y_test, predictions_val, average='weighted')
            f1_val_.append(f1_val)
            loss_val_.append(loss_val)

    before_lr = optimizer.param_groups[0]["lr"]
    #scheduler.step()
    after_lr = optimizer.param_groups[0]["lr"]

    if (Epoch + 1) % 10 == 0:
        print(f"Epoch [{Epoch+1}/{num_epochs}] | Loss_train: {loss_train.item():.4f} | Loss_val: {loss_val.item():.4f} | Adam lr {before_lr:.4f} -> {after_lr:.4f} | Accuracy_train: {accuracy_train:.2f} | Accuracy_val: {accuracy_val:.2f}")

Could you also post the code of your model and the other remaining parts (e.g., which loss function you are using)? A lot of things can be off outside the training loop.

Yeah, sure, I just wanted to keep the post smaller 🙂
The following is my model:

import torch
import torch.nn as nn

class BinaryClassifier(nn.Module):
    def __init__(self, input_size, h1, h2, output_size, weight_init='xavier'):
        super(BinaryClassifier, self).__init__()

        self.l1 = nn.Linear(input_size, h1)
        self.relu1 = nn.ReLU()
        self.dropout1 = nn.Dropout(0.2)
        self.l2 = nn.Linear(h1, h2)
        self.relu2 = nn.ReLU()
        self.dropout2 = nn.Dropout(0.2)
        self.l3 = nn.Linear(h2, output_size)

        # Weight initialization
        if weight_init == 'xavier':
            nn.init.xavier_uniform_(self.l1.weight)
            nn.init.xavier_uniform_(self.l2.weight)
            nn.init.xavier_uniform_(self.l3.weight)
        elif weight_init == 'he':
            nn.init.kaiming_uniform_(self.l1.weight, nonlinearity='relu')
            nn.init.kaiming_uniform_(self.l2.weight, nonlinearity='relu')
            nn.init.kaiming_uniform_(self.l3.weight, nonlinearity='relu')
        else:
            raise ValueError("Invalid weight initialization option. Supported options: 'xavier', 'he'.")

    def forward(self, x):
        x = self.l1(x)
        x = self.relu1(x)
        x = self.dropout1(x)
        x = self.l2(x)
        x = self.relu2(x)
        x = self.dropout2(x)
        x = self.l3(x)
        return torch.sigmoid(x)

This is how I initialize it:

input_size = X_train.shape[1]
model = BinaryClassifier(input_size, 1024, 64, 1)

criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)#, weight_decay=0.01)
#scheduler = lr_scheduler.ExponentialLR(optimizer, gamma=0.99)
#scheduler = lr_scheduler.LinearLR(optimizer, start_factor=1.0, end_factor=0.1, total_iters=30)

And this is how I load the dataset, which consists of 20,000 samples; each sample is an array of 65,536 elements:

# 'path' is a placeholder for the actual file paths
X_train = torch.load('path')
X_test = torch.load('path')
y_train = torch.load('path')
y_test = torch.load('path')

# Make sure everything is float32
X_train = torch.FloatTensor(X_train)
y_train = torch.FloatTensor(y_train)
X_test = torch.FloatTensor(X_test)
y_test = torch.FloatTensor(y_test)

Hm, I can’t see anything obviously wrong. Maybe you can try the Adam optimizer instead of SGD.
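For reference, a minimal sketch of the swap (the learning rate of 1e-3 is just Adam's common default, not something tuned for your task):

# Swap SGD for Adam; lr=1e-3 is Adam's usual default, not tuned for this problem
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)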

Already tried. The code seems OK, btw, and the accuracy is stable now, but I can't reach a value of more than 70-71%.

Are you talking about training or test accuracy? Depending on your task, 71% is not bad, even for a binary task (e.g., sentiment analysis).

Can you overfit your model, i.e., achieve a training accuracy of 100% using a very small sample of your training data?
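Something like this sketch, which reuses the model, criterion, and optimizer you defined above (the subset size of 64 and the 500 epochs are arbitrary choices):

# Sanity check: try to memorize a tiny fixed subset of the training data.
# If the model can't reach ~100% training accuracy here, something is off
# in the model/loss/optimizer wiring rather than in the data.
small_X = X_train[:64]
small_y = y_train[:64].view(-1, 1)

model.train()
for epoch in range(500):
    optimizer.zero_grad()
    loss = criterion(model(small_X), small_y)
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    preds = (model(small_X) > 0.5).float()
    acc = (preds == small_y).float().mean().item()
print(f"Training accuracy on the small subset: {acc:.2f}")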

Actually, with some setups, after a bunch of epochs I can reach 100% accuracy on the training set, while accuracy on the test set stays around 70% (maybe a bit more). So yes, I can. What I'm trying to do is classify a file, i.e., distinguish whether it is encrypted or not. In the previous code I'm training the model on the whole dataset instead of using batch training; in my latest experiments I've tried batching as well (roughly like the sketch below), but the results are pretty sad.
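This is a minimal sketch of the batch training I tried, using a shuffled DataLoader (the batch size of 1024 mirrors the code above; everything else reuses the objects already defined):

from torch.utils.data import DataLoader, TensorDataset

# Wrap the tensors so that shuffled mini-batches are drawn each epoch
train_loader = DataLoader(TensorDataset(X_train, y_train),
                          batch_size=1024, shuffle=True)

model.train()
for epoch in range(num_epochs):
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(X_batch), y_batch.view(-1, 1))
        loss.backward()
        optimizer.step()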