Sentiment analysis using LSTM on imbalanced citation dataset

I have an extremely unbalanced dataset. https://cl.awaisathar.com/citation-sentiment-corpus/
Class POSITIVE: 829
Class NEGATIVE: 280
Class NEUTRAL: 7627

Here is my network:

import torch.nn as nn

class Sentiment_LSTM(nn.Module):
    """
    We are training the embedded layers along with LSTM for the sentiment analysis
    """

    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, drop_prob=0.5):
        """
        Setting up the parameters.
        """
        super(Sentiment_LSTM, self).__init__()

        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim
        
        # embedding layer and LSTM layers 
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers, 
                            dropout=drop_prob, batch_first=True)
        
        # dropout layer to avoid overfitting
        self.dropout = nn.Dropout(0.5)
        
        # linear and sigmoid layers
        self.fc = nn.Linear(hidden_dim, output_size)
        self.sig = nn.Sigmoid()
        

    def forward(self, x):
        """
        Perform a forward pass

        """
        batch_size = x.size(0)

        x = x.long()
        embeds = self.embedding(x)

        lstm_out, hidden = self.lstm(embeds)

    
        # stack up lstm outputs
        lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)


        out = self.dropout(lstm_out)
        out = self.fc(out)

        # sigmoid function
        sig_out = self.sig(out)
        
        # reshape to (batch_size, seq_len, num_classes)
        sig_out = sig_out.view(batch_size, -1, 3)
        # print("sig_out", sig_out.shape)
        sig_out = sig_out[:, -1, :]  # take the output of the last time step
        
        # return the sigmoid output for the last time step
        return sig_out

Loss function:

lr=0.001

criterion = nn.BCELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=lr)

My accuracy is low on the small classes. How can I improve it further?

Hi!

Are you trying to predict all three classes? Positive, neutral, negative?

If so, applying a sigmoid function probably isn’t the way to go, as that’s designed for binary cases. Using a softmax function with NLLLoss is better, or you can pass the raw logits (from the linear layer) to CrossEntropyLoss, which combines softmax + NLLLoss.
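Roughly, the two options look like this (a standalone sketch with made-up shapes: a batch of 4 samples and 3 classes):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)            # raw outputs of the final linear layer
targets = torch.tensor([0, 2, 1, 2])  # class indices (not one-hot)

# Option 1: log-softmax followed by NLLLoss
log_probs = nn.LogSoftmax(dim=1)(logits)
loss_nll = nn.NLLLoss()(log_probs, targets)

# Option 2: CrossEntropyLoss on the raw logits (applies log-softmax internally)
loss_ce = nn.CrossEntropyLoss()(logits, targets)

print(loss_nll.item(), loss_ce.item())  # the two values match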

Hello, thanks for the input.
I modified the network as below.

import torch.nn as nn

class Sentiment_LSTM(nn.Module):
    """
    We are training the embedded layers along with LSTM for the sentiment analysis
    """

    def __init__(self, vocab_size, output_size, embedding_dim, hidden_dim, n_layers, drop_prob=0.5):
        """
        Setting up the parameters.
        """
        super(Sentiment_LSTM, self).__init__()

        self.output_size = output_size
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim
        
        # embedding layer and LSTM layers 
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers, 
                            dropout=drop_prob, batch_first=True)
        
        # dropout layer to avoid overfitting
        self.dropout = nn.Dropout(0.5)
        
        # linear layer (outputs raw logits)
        self.fc = nn.Linear(hidden_dim, output_size)
        

    def forward(self, x):
        """
        Perform a forward pass

        """
        batch_size = x.size(0)

        x = x.long()
        embeds = self.embedding(x)

        lstm_out, hidden = self.lstm(embeds)

    
        # stack up lstm outputs
        lstm_out = lstm_out.contiguous().view(-1, self.hidden_dim)


        out = self.dropout(lstm_out)
        out = self.fc(out)


        
        # reshape to (batch_size, seq_len, num_classes)
        out = out.view(batch_size, -1, 3)
        # print("out", out.shape)
        out = out[:, -1, :]  # take the output of the last time step
        
        # return the raw logits for the last time step
        return out

Loss as follows:

# loss and optimization functions
lr=0.001

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=lr)
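Since the dataset is so skewed, I’m also thinking of weighting the loss by inverse class frequency. A rough sketch, assuming my label indices follow the order at the top of the post (0 = positive, 1 = negative, 2 = neutral; the exact weighting scheme is just a first guess):

import torch
import torch.nn as nn

# class counts from the corpus: positive, negative, neutral
counts = torch.tensor([829.0, 280.0, 7627.0])

# inverse-frequency weights (rarer classes get larger weight)
weights = counts.sum() / (len(counts) * counts)

criterion = nn.CrossEntropyLoss(weight=weights)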

Is my understanding correct?

PyTorch has a tutorial for text classification here. Consider replacing the Bag-of-Words model with an LSTM for your case. The other parts should stay the same, including CrossEntropyLoss.
