Multivariate forecasting using LSTM Network

Hello,

I’m currently working on multivariate forecasting using an RNN with LSTM layers. My data consists of many small samples, each with 21 input features that change over a short time span. I want a network that predicts what each of these 21 features will be at the final time step, which is why I chose an RNN.

The network is currently very simple:

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

in_features = 21
hidden_dim = 200
out_features = 21
n_layers = 1

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        self.lstm = nn.LSTM(in_features, hidden_dim, num_layers = n_layers)
        self.l_out = nn.Linear(in_features = hidden_dim, out_features = out_features)
        
    def forward(self, x):
        x, (h, c) = self.lstm(x)
        x = x.view(-1, self.lstm.hidden_size)
        x = self.l_out(x)
        x = x.reshape(self.l_out.out_features, -1)
        return x

net = Net()
print(net)

For a sample with 5 time steps, the shape of the tensor changes line by line in forward() like this: [5, 1, 21] -> [5, 1, 200] -> [5, 200] -> [5, 21] -> [21, 5]
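To double-check those shapes, here is a minimal sketch that pushes a random tensor through the same layers and records the shape after each step. The random input and the 5 time steps are placeholders standing in for one sample file, not my real data.

```python
import torch
import torch.nn as nn

# Same layer sizes as the network above.
lstm = nn.LSTM(21, 200, num_layers=1)
l_out = nn.Linear(in_features=200, out_features=21)

# Placeholder sample: (seq_len=5, batch=1, features=21).
x = torch.randn(5, 1, 21)
shapes = [tuple(x.shape)]

x, (h, c) = lstm(x)                     # output for every time step
shapes.append(tuple(x.shape))
x = x.view(-1, lstm.hidden_size)        # drop the batch dimension
shapes.append(tuple(x.shape))
x = l_out(x)                            # map hidden state to 21 features
shapes.append(tuple(x.shape))
x = x.reshape(l_out.out_features, -1)   # same reshape as in forward()
shapes.append(tuple(x.shape))

print(shapes)
```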

The problem, however, is in the training step:

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

training_loss, validation_loss = [], []
num_epochs = 5

for i in range(num_epochs):

    epoch_training_loss = 0
    epoch_validation_loss = 0
    
    net.eval()
    
    for file in validation:
        validation_set = np.genfromtxt('trainingsets/' + file, delimiter=',', skip_header=1)
        validation_set = np.nan_to_num(validation_set)

        target = validation_set[-1,0:21]
        input = validation_set[:-1,0:21]

        input = torch.Tensor(input)
        input = input.reshape(input.size(0), 1, 21)
        
        target = torch.LongTensor(target)

        outputs = net(input)
        
        loss = criterion(outputs, target)
        
        epoch_validation_loss += loss.detach().numpy()
    
    net.train()
    
    for file in training:
        training_set = np.genfromtxt('trainingsets/' + file, delimiter=',')
        training_set = np.nan_to_num(training_set)
        
        target = training_set[-1,0:21]
        input = training_set[:-1,0:21]

        input = torch.Tensor(input)
        input = input.reshape(input.size(0), 1, 21)
        target = torch.LongTensor(target)
        
        outputs = net(input)
        loss = criterion(outputs, target)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        epoch_training_loss += loss.detach().numpy()

I know that loading my data from individual files is slow, and I’ll fix that later. My approach is to feed the network every time step except the last one and to use the last one as the target. The problem is the loss: loss = criterion(outputs, target) with cross-entropy loss treats target as an array of class indices, but I’m passing it the expected final row of feature values. That is where I’m confused, and I can’t figure out what to do instead.
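To make the confusion concrete, here is a minimal sketch of what I think happens to the target. The target values are the example row from this post; the random outputs tensor is only a placeholder with the same [21, 5] shape my network returns. I use .long() here, which, as far as I can tell, is what the LongTensor conversion in my loop amounts to.

```python
import torch
import torch.nn as nn

# Example target row from this post: 21 float values between 0 and 1.
target = torch.tensor([
    0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.6846, 0.4871, 0.0037, 0.0000,
    0.0000, 0.8478, 0.7967, 0.6285, 0.5000, 0.0000, 0.7899, 0.6883, 0.0000,
    0.0000, 0.0000, 0.6285,
])

# Casting to an integer type truncates the fractional part,
# so every value below 1 collapses to 0.
as_index = target.long()
print(as_index.unique())

# Placeholder network output with shape [21, 5].
outputs = torch.randn(21, 5)

# CrossEntropyLoss reads this as 21 examples with 5 classes each,
# and the truncated target says "class 0" for all of them.
loss = nn.CrossEntropyLoss()(outputs, as_index)
print(loss.item())
```

So the loss runs without an error, but it is scoring a 5-class classification against an all-zero label vector rather than comparing predicted feature values with the real ones.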

Here is an example of an input tensor and its target tensor:

tensor([[[-0.5615,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.3421,
           0.4706,  0.0000,  0.0000,  0.0000,  0.0000,  0.2183,  0.0000,
           0.0000, -0.1594,  0.0000,  0.0000,  0.0000,  0.0000,  0.2183]],

        [[ 0.5615,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.3421,
           0.4706,  0.0000,  0.0000,  0.1450,  0.0000,  0.2892,  0.5000,
           0.0000, -0.3223,  0.2806,  0.0000,  0.0000,  0.0000,  0.2892]],

        [[ 0.6079,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.3704,
           0.5094,  0.0000,  0.0000,  0.2859,  0.2734,  0.4326,  0.5000,
           0.0000,  0.0372,  0.3718,  0.0000,  0.0000,  0.0000,  0.4326]],

        [[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.7289,  0.6256,
           0.5455,  0.0000,  0.0000,  0.4225,  0.5390,  0.5354,  0.5000,
           0.0000,  0.4953,  0.5561,  0.0000,  0.0000,  0.0000,  0.5354]]])
tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.6846, 0.4871, 0.0037, 0.0000,
        0.0000, 0.8478, 0.7967, 0.6285, 0.5000, 0.0000, 0.7899, 0.6883, 0.0000,
        0.0000, 0.0000, 0.6285])

Hope you can help me and clarify what I’m doing wrong. Thanks!