RNN many-to-one

Dear PyTorch experts,

I am trying to understand RNNs and how to implement one as a classifier (many-to-one). I've read many tutorials but am still confused. One of these tutorials suggests using the following:

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Recurrent neural network (many-to-one)
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)
    
    def forward(self, x):
        # Set initial hidden and cell states 
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device) 
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(device)
        
        # Forward propagate LSTM
        out, _ = self.lstm(x, (h0, c0))  # out: tensor of shape (batch_size, seq_length, hidden_size)
        
        # Decode the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out
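
To check my understanding, this is how I would call it (the sizes here are placeholders I made up, using the class and device defined above):

model = RNN(input_size=28, hidden_size=128, num_layers=2, num_classes=2).to(device)
x = torch.randn(16, 10, 28).to(device)   # (batch=16, seq_len=10, input_size=28)
out = model(x)                           # shape: (16, 2) -> (batch, num_classes)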

What I am confused about is why the last time step is used as the input to the dense layer. Also, for a binary classification problem a sigmoid activation is usually applied to the output, so why is it not applied here (nor in any of the tutorials I have read)?

Thank you very much for your help

(tutorial from here)


In batch-first mode (which is what you have), the output has shape (batch, seq_len, hidden_size). So with -1 in the middle dimension you are extracting the very last time step of the RNN's output sequence. That final output acts as a summary of the entire sequence, which is exactly what you want in a many-to-one scenario. The two : slices simply mean you keep that last time step for every sample in the batch and for every feature of the hidden vector.
