I am working with a dataset for training a deep learning LSTM model in PyTorch. I have a working model with a single variable as input, but I was wondering what the convention is for a multi-dimensional input tensor. I already know the LSTM module in PyTorch accepts data of the form (batch_size, sequence_length, input_size) (with batch_first=True); however, I'd like to use training data of the form:
Date | x1 | x2 | x3 |
---|---|---|---|
date1 | data11 | data12 | data13 |
date2 | data21 | data22 | data23 |
etc | etc | etc | etc |
I am using a moving-window method to get the sequences, and they are all stored in a large tensor of shape (5, 36070, 10, 1). So: 5 lists (corresponding to the number of variables) of 36070 windows, each 10 elements long. Due to the input shape the LSTM module expects, my initial approach was to reshape the large tensor to (180350, 10, 1), i.e. the 5 lists of 36070 stacked along the batch dimension. Is this the correct approach? Or can the input_size of the LSTM module accept a tensor shape instead of a scalar representing the size of a one-dimensional vector?
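For reference, here is a minimal sketch of the stacking approach described above, using dummy random data in place of the real windowed sequences (the shapes are taken from the question; hidden_size=8 is an arbitrary choice for illustration):

```python
import torch

# Dummy stand-in for the windowed data: 5 variables, 36070 windows,
# window length 10, feature size 1.
n_vars, n_windows, seq_len = 5, 36070, 10
big = torch.randn(n_vars, n_windows, seq_len, 1)

# The stacking approach: fold the variable axis into the batch axis,
# giving (5 * 36070, 10, 1).
stacked = big.reshape(n_vars * n_windows, seq_len, 1)
print(stacked.shape)  # torch.Size([180350, 10, 1])

# An LSTM declared with input_size=1 accepts this layout directly
# (batch_first=True makes the expected layout (batch, seq, feature)).
lstm = torch.nn.LSTM(input_size=1, hidden_size=8, batch_first=True)
out, _ = lstm(stacked[:4])  # forward a few windows as a sanity check
print(out.shape)  # torch.Size([4, 10, 8])
```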