Question about Neural Network outputs for a Binary Classification problem

ntquanghai · May 11, 2023, 4:35pm

So I currently have this graph dataset, which has two classes, and I intend to treat it as a binary classification problem. The following is the data's information:

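import torch
from torch import nn
from torch_geometric.datasets import EllipticBitcoinDataset
from torch_geometric.nn import GCNConv
from torch_geometric.transforms import NormalizeFeatures

bitcoin = EllipticBitcoinDataset(root='data/EllipticBitcoinDataset', transform=NormalizeFeatures())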
bitcoinData = bitcoin[0]
print(bitcoinData)
###
Data(x=[203769, 165], edge_index=[2, 234355], y=[203769], train_mask=[203769], test_mask=[203769])
###


class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels = 64):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, 2)
        self.activation = nn.ReLU()
        self.outputs = nn.Linear(2, 1)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = self.activation(x)
        x = self.conv2(x, edge_index)
        x = self.outputs(x)
        return torch.sigmoid(x)

model = GCN(in_channels=bitcoinData.x.size(dim=1))
criterion = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
print(bitcoinData)
for i in range(201):
    model.train()
    optimizer.zero_grad()
    output = model(bitcoinData.x, bitcoinData.edge_index)    
    loss = criterion(output[bitcoinData.train_mask], bitcoinData.y.unsqueeze(1)[bitcoinData.train_mask].to(torch.float))
    # loss = criterion(output, bitcoin[0].y[data.train_mask])
    if i % 20 == 0:
        print(f'Epoch: {i}, Loss: {loss.item()}')
    loss.backward()
    optimizer.step()
###
Epoch: 0, Loss: 0.909164309501648
Epoch: 20, Loss: 0.5957821011543274
Epoch: 40, Loss: 0.3599470853805542
Epoch: 60, Loss: 0.36201682686805725
Epoch: 80, Loss: 0.36009129881858826
Epoch: 100, Loss: 0.3595191538333893
Epoch: 120, Loss: 0.35936957597732544
Epoch: 140, Loss: 0.35924962162971497
Epoch: 160, Loss: 0.35912320017814636
Epoch: 180, Loss: 0.3589892089366913
Epoch: 200, Loss: 0.35884779691696167
###

model.eval()

out = model(bitcoinData.x, bitcoinData.edge_index)

print(out)
###
tensor([[0.1474],
        [0.1474],
        [0.1474],
        ...,
        [0.1478],
        [0.1362],
        [0.1476]], grad_fn=<SigmoidBackward0>)
###

I have tried applying log_softmax and argmax to the output, but what I get in return is a tensor full of zeros. I wanted to ask whether my approach is correct, and what else I could do to correctly obtain the predicted labels. I have read some discussions that suggest adjusting the threshold, but is there a programmatic or conventional approach to that, or do I need to figure it out manually?

I do not think there is a problem with the dataset, since it is a default dataset imported from the PyTorch Geometric library, so I believe I have made mistakes in my solution. I am quite amateurish when it comes to this topic, so I do apologize if my question seems too simple. I would love to hear your thoughts and comments, and I truly appreciate any help and response. Thank you all!

ptrblck · May 15, 2023, 5:07am

Applying torch.argmax to an output with a single value per sample is wrong, as it can only return zeros: index 0 is the only index along that dimension.
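To make the point concrete, here is a small sketch using three of the values printed in the question as a stand-in for the model output. With shape [N, 1] there is only one column to choose from, so argmax has nothing to compare, and log_softmax normalizes each row against itself.

```python
import torch

# Stand-in for the model output: one probability per sample, shape [3, 1]
# (values copied from the tensor printed in the question)
out = torch.tensor([[0.1474], [0.1478], [0.1362]])

# Along dim=1 there is a single entry per row, so the winning index is always 0
argmax_preds = torch.argmax(out, dim=1)

# A softmax over one element is 1, so its log is 0 for every row
log_probs = torch.log_softmax(out, dim=1)

print(argmax_preds)
print(log_probs)
```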
Since your binary classification use case outputs a single probability for each sample, you can apply a probability threshold (e.g. 0.5) to get the predicted class:

preds = output > 0.5

This threshold can also be tuned using the ROC curve.
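One common way to read a threshold off the ROC curve is to pick the point that maximizes Youden's J statistic (true positive rate minus false positive rate). The sketch below assumes scikit-learn is installed; `pick_threshold` is a hypothetical helper, and the labels and probabilities are made-up values that only mimic the low, tightly clustered outputs shown in the question. Other criteria (for example, a target precision or recall) are equally valid choices.

```python
import numpy as np
from sklearn.metrics import roc_curve


def pick_threshold(y_true, y_prob):
    """Return the ROC threshold that maximizes Youden's J (TPR - FPR)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    best = np.argmax(tpr - fpr)
    return float(thresholds[best])


# Made-up held-out labels and predicted probabilities (illustration only)
y_true = np.array([0, 0, 0, 0, 1, 1])
y_prob = np.array([0.10, 0.12, 0.14, 0.15, 0.16, 0.30])

threshold = pick_threshold(y_true, y_prob)

# scikit-learn treats a sample as positive when its score is >= the threshold
preds = (y_prob >= threshold).astype(int)
print(threshold, preds)
```

In practice you would pass the detached sigmoid outputs and labels of held-out labeled nodes rather than the test nodes, so the chosen threshold does not leak test information.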

I would also suggest removing the torch.sigmoid and replacing nn.BCELoss with nn.BCEWithLogitsLoss for better numerical stability.
The predictions can still be created by applying the sigmoid and the threshold to the logit outputs.
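A minimal sketch of that change follows. To keep it self-contained, a plain `nn.Linear` stands in for the GCN and the features and labels are made up; in the model from the question, the equivalent change would be returning `x` instead of `torch.sigmoid(x)` from `forward`.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for the GCN (assumption): any module that returns one raw logit per sample
model = nn.Linear(4, 1)
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally

x = torch.randn(8, 4)  # made-up features
y = torch.tensor([0., 1., 0., 1., 1., 0., 0., 1.]).unsqueeze(1)  # made-up labels, shape [8, 1]

# Training step: feed raw logits to the loss, with no torch.sigmoid in between
logits = model(x)
loss = criterion(logits, y)
loss.backward()

# Inference: apply the sigmoid explicitly, then threshold
with torch.no_grad():
    probs = torch.sigmoid(model(x))
    preds = (probs > 0.5).long()

print(loss.item(), preds.squeeze(1))
```

Since the sigmoid is monotonic and sigmoid(0) = 0.5, thresholding the probabilities at 0.5 gives the same labels as thresholding the raw logits at 0.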