I am trying to train a bi-directional LSTM. My input data has shape [batch_size, 8 (seq), 6] and my output has shape [batch_size, 2]. My code:
```python
import torch
import torch.nn as nn

class MODEL(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes, device):
        super(MODEL, self).__init__()
        torch.manual_seed(0)
        self.device = device
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                            batch_first=True, bidirectional=True).to(self.device)
        # bidirectional LSTM doubles the feature dimension, hence hidden_size*2
        self.fc1 = nn.Linear(hidden_size*2, num_classes).to(self.device)
        self.dropout = nn.Dropout(0.2)
        self.activation = nn.Sigmoid()

    def forward(self, x):
        # 2*num_layers initial states because the LSTM is bidirectional
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(self.device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(self.device)
        out, _ = self.lstm(x, (h0, c0))
        out = out[:, -1, :]  # keep only the last time step
        out = self.activation(self.fc1(out))
        return out
```
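A quick shape check confirms the forward pass maps [batch_size, 8, 6] to [batch_size, 2] (the hidden_size, num_layers, and batch size here are illustrative placeholders, not my actual settings):

```python
device = torch.device("cpu")
model = MODEL(input_size=6, hidden_size=64, num_layers=2, num_classes=2, device=device)
x = torch.randn(4, 8, 6)   # [batch_size, seq_len, features]
print(model(x).shape)      # torch.Size([4, 2])
```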
I am using the Adam optimiser with a learning rate of 0.0015 and SmoothL1Loss. For reference, I am trying to implement this paper - IONet - except that their sequence length is 200.
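My training setup looks roughly like this (a minimal sketch: `train_loader` and `num_epochs` are placeholders, not my exact values):

```python
criterion = nn.SmoothL1Loss()  # also tried nn.MSELoss(), see below
optimizer = torch.optim.Adam(model.parameters(), lr=0.0015)

for epoch in range(num_epochs):
    for inputs, targets in train_loader:  # inputs: [B, 8, 6], targets: [B, 2]
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```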
I have also tried using MSELoss, but the loss still fluctuates and does not decrease. I also referred to a previously raised issue that sounds similar to mine (LSTM loss keeps fluctuating), but it seems it hasn't been solved yet.
My loss keeps fluctuating: it initially decreases, but then it starts to increase. Any help is appreciated. Thanks.