For an overfitting LSTM model, I tried adding dropout. The accuracy seems to go down compared with the baseline model. I validated across 5 train/test splits.

Do you mean the training accuracy (resubstitution) or the validation accuracy?

It’s common to see the training accuracy drop a bit, since the model is effectively “smaller”, i.e. it should have less capacity. The validation accuracy, however, shouldn’t go down.

Have you checked that you call `model.train()` and `model.eval()` for training and evaluation respectively?
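To illustrate why the mode matters, here is a minimal sketch using a toy `nn.Sequential` model (a hypothetical stand-in, not the model from this thread). In training mode `nn.Dropout` zeroes random activations and rescales the survivors by `1 / (1 - p)`; in evaluation mode it is a no-op, so repeated forward passes give identical outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the random dropout mask reproducible

# Toy stand-in model (hypothetical): a linear layer followed by dropout.
model = nn.Sequential(nn.Linear(8, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

model.train()                     # sets .training = True on every submodule
train_flag = model[1].training
# In training mode each element is either zeroed or scaled by 1 / (1 - p) = 2.
drop_train = model[1](torch.ones(1, 1000))

model.eval()                      # sets .training = False on every submodule
eval_flag = model[1].training
with torch.no_grad():
    out_a = model(x)              # dropout is the identity in eval mode,
    out_b = model(x)              # so both passes agree exactly

print(train_flag, eval_flag)
print(torch.equal(out_a, out_b))
```

If `model.eval()` were skipped before validation, the dropout mask would still be applied and the validation accuracy would look worse than it really is.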

Hi, thanks for your reply, and yes: I use `model.train()` and `model.eval()` respectively. The validation accuracy is going down.

**This is my model**.

I chose a simple model with cross-entropy loss and the Adam optimiser. But the test accuracy is really poor (69%) compared with an SVM on the same task. Training accuracy is 93%.

I tried various regularization techniques (dropout, weight norm, L2). Nothing seems to address the overfitting problem.

Is there a problem with my model? Or with the way I evaluate the model? Or with the loss computation? I am really at a loss.

```
class LSTM(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, vocab_size, label_size):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.label_size = label_size
        update_dim = hidden_dim
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(input_size=embedding_dim, hidden_size=hidden_dim)
        #self.lstm = nn.utils.weight_norm(self.lstm, name = 'weight_hh_l0')
        #self.lstm = nn.utils.weight_norm(self.lstm, name = 'weight_ih_l0')
        #self.lstm.flatten_parameters()
        self.fullyconnected = nn.Linear(update_dim, 100)
        self.relu = nn.ReLU()
        #self.dropout = nn.Dropout(0.3)
        self.last = nn.Linear(100, label_size)
        self.hidden = self.init_hidden()

    def forward(self, sentence, aspect_term):  # sent separately
        # input standardisation
        sentence = self.embeddings(sentence).view(len(sentence), 1, -1)
        lstm_out, self.hidden = self.lstm(sentence, self.hidden)  # updating hidden and cell states
        embedding_vec = lstm_out[-1]
        fc = self.relu(self.fullyconnected(embedding_vec))
        y = self.last(fc)
        probs = F.softmax(y)
        return probs

    def init_hidden(self):
        return (my_variable(torch.zeros(1, 1, self.hidden_dim)),
                my_variable(torch.zeros(1, 1, self.hidden_dim)))
```

```
# loss update (schematic)
loss = loss_function([0.2, 0.2, 0.6], actual_label=2)
```

```
# Model evaluation
def getpred(model, loss_function, x_pred, y_pred):
    counter = 0
    total_loss = 0.0
    XTEST = x_pred
    YTEST = y_pred
    model.eval()
    for i, x_test in enumerate(XTEST):
        sentence = my_variable(LongTensor([int(n) for n in x_test[0]]))
        aspect = my_variable(LongTensor([int(n) for n in x_test[1]]))
        model.hidden = model.init_hidden()
        # forward() takes both the sentence and the aspect term
        probs = model(sentence, aspect)
        # Compute loss
        true_label = my_variable(LongTensor([int(YTEST[i])]), requires_grad=False)
        loss = loss_function(probs, true_label)
        total_loss += float(loss.data[0])
        # Get prediction for max prob
        max_value, idx = torch.max(probs, 1)
        if USE_CUDA:
            Y_pred = idx.data.cpu().numpy()
        else:
            Y_pred = idx.data.numpy()
        Y_target = YTEST[i]
        if Y_pred == Y_target:
            counter += 1
    print('Loss - {}'.format(float(total_loss)))
    print('Accuracy - {}'.format((counter / len(XTEST)) * 100))
```
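One thing that may be worth checking in the code above: `forward` returns `F.softmax(y)`, and those probabilities are then passed to the loss. If `loss_function` is `nn.CrossEntropyLoss` (an assumption, since the post only says "cross entropy loss"), it already applies log-softmax internally and expects raw logits, so the softmax ends up being applied twice. The sketch below uses made-up logits for a 3-class problem to show the effect; the tensor values and variable names are illustrative only.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumption: the thread's loss_function is nn.CrossEntropyLoss.
loss_function = nn.CrossEntropyLoss()

# Made-up logits for one sample; class 2 is confidently (and correctly) predicted.
logits = torch.tensor([[-2.0, 0.0, 6.0]])
target = torch.tensor([2])

probs = F.softmax(logits, dim=1)

loss_from_logits = loss_function(logits, target)  # intended usage
loss_from_probs = loss_function(probs, target)    # softmax applied twice

# With inputs confined to [0, 1], the 3-class loss cannot drop below
# -log(e / (e + 2)) = log(2 + e) - 1, no matter how confident the model is.
floor = math.log(2 + math.e) - 1

print(loss_from_logits.item(), loss_from_probs.item(), floor)

# The predicted class is unchanged, so accuracy code based on argmax still works.
same_prediction = torch.argmax(logits, dim=1).item() == torch.argmax(probs, dim=1).item()
```

Under that assumption, the usual arrangement is to return `y` (the logits) from `forward` for the loss and apply softmax only where probabilities are actually needed. Whether this explains the gap between training and test accuracy here is not certain, but the compressed loss does weaken the gradient signal.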