Problem with converting my LSTM multi-class classification model to a binary classification model

Hey @ptrblck, thank you for your answer! Actually, the number of classes is not 256 but 2: either 0 or 1. In the current version (multi-class classification), I call the “Classifier” like this:

model = Classifier(n_features = numFeatures, n_classes = 2)

and as you can see in the init method of the “ModuleLSTM” class, the last layer is

self.classifier = nn.Linear(n_hidden, n_classes) # where n_classes = 2

and in the init method of “Classifier”:

self.model = ModuleLSTM(n_features, n_classes)
self.criterion = nn.CrossEntropyLoss()
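
Just to make explicit what this current setup expects: CrossEntropyLoss with n_classes = 2 works on raw logits of shape (batch, 2) and integer class targets of shape (batch,). The batch size of 64 below is only an example, not from my code:

import torch
import torch.nn as nn

logits = torch.randn(64, 2)            # output of nn.Linear(n_hidden, 2) for a batch of 64
targets = torch.randint(0, 2, (64,))   # integer class labels, 0 or 1
loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.item())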

As you can see, this is a multi-class classification configuration. Since I have only 2 classes (0/1), I’d like to modify this model for binary classification. Therefore, I am trying to migrate from CrossEntropyLoss to BCELoss and from the plain linear output to a Sigmoid activation. The problem is that, according to the PyTorch LSTM documentation, the shape of the returned “hidden” state in

_, (hidden, _) = self.lstm(x)

is (num_layers * num_directions, batch, hidden_size). Since I use n_hidden=256, my output becomes [64, 256] (batch of 64, hidden size 256) and I get a size mismatch error. This is why I wonder whether I also have to rewrite the forward method of my “ModuleLSTM” from scratch to adapt the model to binary classification.
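
For reference, here is a rough sketch of how I imagine the binary version could look. The class names and n_hidden = 256 are from my code above, but the forward body, batch_first=True, the number of layers, writing “Classifier” as a plain nn.Module, and the switch to BCEWithLogitsLoss (which combines Sigmoid and BCELoss in one, numerically more stable step) are just my assumptions, not a definitive implementation:

import torch
import torch.nn as nn

class ModuleLSTM(nn.Module):
    def __init__(self, n_features, n_hidden=256, n_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
        )
        # single output unit instead of n_classes = 2
        self.classifier = nn.Linear(n_hidden, 1)

    def forward(self, x):
        # hidden: (num_layers * num_directions, batch, hidden_size)
        _, (hidden, _) = self.lstm(x)
        # last layer's hidden state -> (batch, hidden_size), e.g. [64, 256]
        out = hidden[-1]
        # raw logits of shape (batch, 1); no Sigmoid here because
        # BCEWithLogitsLoss applies it internally
        return self.classifier(out)

class Classifier(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.model = ModuleLSTM(n_features)
        self.criterion = nn.BCEWithLogitsLoss()

    def forward(self, x, labels=None):
        logits = self.model(x)   # (batch, 1)
        loss = None
        if labels is not None:
            # BCEWithLogitsLoss needs float targets with the same shape as the logits
            loss = self.criterion(logits, labels.float().unsqueeze(1))
        return loss, logits

With this I would call model = Classifier(n_features = numFeatures) as before, just without the n_classes argument, and I would have to make sure the labels are floats rather than long integers. Does something like this go in the right direction, or does the forward method need a different change?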