Many-to-One LSTM Input Shape

“One-to-many sequence problems are sequence problems where the input data has one time-step, and the output contains a vector of multiple values or multiple time-steps.”

I am trying to build a one-to-many LSTM-based model in PyTorch.

It is a binary classification problem, so there are only 2 classes. However, each label should be a vector over the 2 classes, for example:

LABEL VECTOR [array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([0., 1.]), array([1., 0.]), array([1., 0.]), array([1., 0.]), array([1., 0.]), array([1., 0.]), array([0., 1.])]

num_classes = 2

from torch import nn

#define Model
class LSTMClassifier(nn.Module):

    def __init__(self, input_size, lstm1_hidden_size,num_layers, num_classes):
        super(LSTMClassifier, self).__init__()

        #shape = (1,8192,16)
        self.lstm1 = nn.LSTM(input_size=input_size, hidden_size=lstm1_hidden_size, num_layers=num_layers, batch_first=True)
     
        self.classifier = nn.Linear(lstm1_hidden_size, num_classes)

        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        lstm_out, _ = self.lstm1(x) #hidden state & cell state returned 

        pred = self.classifier(lstm_out)

        pred = self.sigmoid(pred)


        return pred

This model produces an output of shape (1, num_segments, 2).

The shape of the data is:
(1,num_segments,8192)

The shape of the labels is:
(1,num_segments,16,2)

Again the labels look like the following:

  • There are always exactly 16 labels per segment, each a binary classification encoded as 2 columns. Each column is either 0 or 1, hence 2 classes. So I want the output of the LSTM model to be a sequence of binary classifications.

Right now the error I am getting is:

ValueError: Target size (torch.Size([1, 7, 16, 2])) must be the same as input size (torch.Size([1, 7, 2]))
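
The mismatch can be reproduced in isolation. The sketch below assumes the loss is nn.BCEWithLogitsLoss, which the post does not state, and uses random tensors with the shapes from the error message in place of the real model output and labels.

```python
import torch
from torch import nn

num_segments, num_labels, num_classes = 7, 16, 2

# Stand-ins for the model output and the labels, using the shapes from the error.
pred = torch.rand(1, num_segments, num_classes)
target = torch.randint(0, 2, (1, num_segments, num_labels, num_classes)).float()

# Assumption: the loss function is not named in the post.
loss_fn = nn.BCEWithLogitsLoss()

try:
    loss_fn(pred, target)
    mismatch_message = None
except ValueError as err:
    # The loss requires prediction and target to have identical shapes.
    mismatch_message = str(err)

print(mismatch_message)
```

The prediction has no dimension for the 16 labels, so the two tensors cannot be compared element by element.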

How can I structure this PyTorch LSTM model so that its output is a vector of binary classification labels?

An nn.LSTM layer returns output, (h_n, c_n) (see the LSTM page in the PyTorch 1.10.1 documentation).
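
As a quick check of those return values, here is a minimal sketch that prints their shapes. The sizes are small illustrative values, not the ones from the question.

```python
import torch
from torch import nn

# Illustrative sizes only (assumptions, not taken from the question).
batch, seq_len, input_size = 1, 7, 32
hidden_size, num_layers = 8, 2

lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
               num_layers=num_layers, batch_first=True)
x = torch.randn(batch, seq_len, input_size)

output, (h_n, c_n) = lstm(x)

print(output.shape)  # (batch, seq_len, hidden_size): last layer, every time step
print(h_n.shape)     # (num_layers, batch, hidden_size): every layer, final time step
print(c_n.shape)     # same shape as h_n

# For a unidirectional LSTM, the last time step of `output` is the
# final hidden state of the last layer.
same = torch.allclose(output[:, -1, :], h_n[-1])
print(same)
```

With batch_first=True only the input and output tensors are batch-first; h_n and c_n keep the layer dimension first.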

The final hidden state of the last layer is usually used for binary classification.
To do so:

#define Model
class LSTMClassifier(nn.Module):
    def __init__(self, input_size, lstm1_hidden_size,num_layers, num_classes):
        super(LSTMClassifier, self).__init__()
        #shape = (1,8192,16)
        self.lstm1 = nn.LSTM(input_size=input_size, hidden_size=lstm1_hidden_size, num_layers=num_layers, batch_first=True)
        self.classifier = nn.Linear(lstm1_hidden_size, num_classes)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        _, (h_n, _) = self.lstm1(x)  # h_n: final hidden state of every layer
        # h_n has shape (D * num_layers, N, H_out); D = 1 here (unidirectional)
        last_h = h_n[-1]  # last layer only: (N, H_out), matches the classifier input
        pred = self.classifier(last_h)
        pred = self.sigmoid(pred)
        return pred

The above is example code to help you understand.
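
The example above yields one prediction per input sequence, while the question asks for 16 two-column labels per segment. A minimal sketch of one way to get an output of shape (1, num_segments, 16, 2) follows. It assumes each segment is one LSTM time step, that the 16 labels can be predicted from that step's hidden state, and that nn.BCEWithLogitsLoss is the loss. The names SegmentLabelClassifier and num_labels are hypothetical, and the input size is shrunk from 8192 to keep the example light.

```python
import torch
from torch import nn


class SegmentLabelClassifier(nn.Module):  # hypothetical name, for illustration
    def __init__(self, input_size, hidden_size, num_layers, num_labels, num_classes):
        super().__init__()
        self.num_labels = num_labels
        self.num_classes = num_classes
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            num_layers=num_layers, batch_first=True)
        # One linear layer emits all num_labels * num_classes scores per segment.
        self.classifier = nn.Linear(hidden_size, num_labels * num_classes)

    def forward(self, x):
        lstm_out, _ = self.lstm(x)          # (batch, num_segments, hidden_size)
        logits = self.classifier(lstm_out)  # (batch, num_segments, num_labels * num_classes)
        # Split the last dimension so the output matches the label layout.
        return logits.reshape(x.size(0), x.size(1), self.num_labels, self.num_classes)


model = SegmentLabelClassifier(input_size=64, hidden_size=32, num_layers=1,
                               num_labels=16, num_classes=2)

x = torch.randn(1, 7, 64)                            # (batch, num_segments, features)
target = torch.randint(0, 2, (1, 7, 16, 2)).float()  # random stand-in labels

logits = model(x)
loss = nn.BCEWithLogitsLoss()(logits, target)  # shapes now match
print(logits.shape, loss.item())
```

This sketch returns raw logits because nn.BCEWithLogitsLoss applies the sigmoid internally; apply torch.sigmoid to the logits at inference time if probabilities are needed.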