I am experimenting with BCEWithLogitsLoss to build a binary classifier. My problem does not really require a neural network, but I am curious to make it work.
Here is the pseudo-code for the model:
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BinaryClassifier(nn.Module):
        def __init__(self, input_features, hidden_dim, output_dim):
            super(BinaryClassifier, self).__init__()
            self.fc1 = nn.Linear(input_features, hidden_dim)
            self.fc2 = nn.Linear(hidden_dim, hidden_dim)
            self.fc3 = nn.Linear(hidden_dim, output_dim)

        def forward(self, x):
            x = F.relu(self.fc1(x))
            x = F.dropout(F.relu(self.fc2(x)))
            x = self.fc3(x)  # raw outputs, no activation on the last layer
            return torch.squeeze(x)
And here is the training loop:
    train_loader = get_train_data_loader(...)
    model = BinaryClassifier(args.input_features, args.hidden_dim, args.output_dim)
    optimizer = optim.Adam(model.parameters(), lr = 0.1)
    criterion = torch.nn.BCEWithLogitsLoss()

    for epoch in range(1, epochs + 1):
        model.train()
        total_loss = 0
        for batch in train_loader:
            # get data
            batch_x, batch_y = batch
            batch_x = batch_x.to(device)
            batch_y = batch_y.to(device)

            optimizer.zero_grad()

            # get predictions from model
            y_pred = model(batch_x)
            scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 100, gamma=0.1, last_epoch=-1)

            # perform backprop
            loss = criterion(y_pred, batch_y)
            loss.backward()
            optimizer.step()
            scheduler.step()

            total_loss += loss.data.item()
The training data is tabular: 3 features, around 100 samples.
The target is the class: 0 or 1.
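To make the data description concrete, here is a minimal sketch of a dataset with the same shape. The values are random placeholders, not my real data, and the batch size of 16 is an arbitrary assumption. As far as I understand, BCEWithLogitsLoss wants the targets as floats with the same shape as the model output, so the 0/1 labels are stored as floats here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random placeholder data: 100 samples, 3 features (not the real dataset).
features = torch.randn(100, 3)
# Classes 0/1, cast to float because BCEWithLogitsLoss expects float targets.
labels = torch.randint(0, 2, (100,)).float()

# Batch size 16 is an arbitrary choice for illustration.
loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)
batch_x, batch_y = next(iter(loader))
```

With this layout each batch of targets has shape (batch_size,), which matches the squeezed output of a model whose last layer has a single output unit.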
After 1000 epochs (too many, in my opinion), the loss stays quite high and the model's outputs still look odd to me (negative values, very large values, …).
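One thing I am unsure about is the scheduler: in my loop it is constructed inside the batch loop, so I suspect its step count never reaches 100 and the learning rate stays at 0.1. A minimal sketch of the pattern as I understand it, with the scheduler built once and stepped once per epoch, is below. The nn.Linear model and the random data are hypothetical stand-ins, not my actual setup.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-ins for the real model and data.
model = nn.Linear(3, 1)
x = torch.randn(100, 3)
y = torch.randint(0, 2, (100,)).float()

optimizer = optim.Adam(model.parameters(), lr=0.1)
# Built once, outside the loops, so it keeps counting across epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)
criterion = nn.BCEWithLogitsLoss()

lrs = []
for epoch in range(1, 201):
    optimizer.zero_grad()
    loss = criterion(model(x).squeeze(1), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # one scheduler step per epoch
    lrs.append(scheduler.get_last_lr()[0])
```

Here the learning rate recorded in lrs is multiplied by 0.1 after every 100 epochs, which is what I originally intended.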
I sense I am doing something wrong in how I use the loss and configure the last layer. As the name implies, BCEWithLogitsLoss expects logits. That means the raw output of the network, right? Or am I missing something?
Could someone point me in the right direction?
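To check my understanding of "logits", here is a small sketch with made-up values. My assumption is that BCEWithLogitsLoss applies the sigmoid internally, so it should give the same value as BCELoss applied to torch.sigmoid of the raw outputs. If that is right, negative or large raw outputs are expected, and probabilities only appear after the sigmoid.

```python
import torch
import torch.nn as nn

# Made-up raw model outputs (logits) and float targets, for illustration only.
logits = torch.tensor([-2.0, 0.0, 3.5])
targets = torch.tensor([0.0, 1.0, 1.0])

# Loss computed directly on the raw outputs.
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Same loss computed manually: sigmoid first, then plain BCE.
probs = torch.sigmoid(logits)
loss_manual = nn.BCELoss()(probs, targets)

# Class predictions come from thresholding the probabilities at 0.5,
# which is the same as thresholding the logits at 0.
predicted_classes = (probs >= 0.5).long()
```

The two loss values agree up to floating-point error, and the predicted classes are obtained from the probabilities rather than from the raw outputs.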