A question about the input of nn.LSTM

Hi, a simple question (maybe a silly one).

The input of nn.LSTM needs to be a 3D tensor of shape (timestep, batch, features). For example, I have a single time series of length 10 with a feature dimension of 4, and I want a batch size of 4. How do I deal with the timestep dimension? If I do this:

import torch
from torch import nn

class x_lstm(nn.Module):
    def __init__(self):
        super(x_lstm, self).__init__()
        self.lstm = nn.LSTM(
            input_size=4,
            hidden_size=4,
            num_layers=1
        )

    def forward(self, x, h_state):
        output, h_state = self.lstm(x, h_state)
        return output, h_state

x_lstm = x_lstm()

# a single sample's features are 1x4, but with batch_size=4 the dataloader
# yields a 4x4 tensor; the dataset has 10 samples and the batch size is 4
dataloader = torch.utils.data.DataLoader(DataSet, batch_size=4)
h_state = None  # let the LSTM initialize the hidden state to zeros
for step, features in enumerate(dataloader):
    features = features.view(-1, 4, 4)  # reshape to (timestep, batch, features)
    output, h_state = x_lstm(features, h_state)

then I get this:

RuntimeError: invalid argument 2: size '[-1 x 4 x 4]' is invalid for input with 8 elements at /pytorch/torch/lib/TH/THStorage.c:37

Because with 10 samples and a batch size of 4, the last batch has only 2 samples left over, which is not enough to form a full batch of 4. Those 2 samples give 2 × 4 = 8 elements, which cannot be reshaped to (-1, 4, 4), since that needs a multiple of 16.

So, what am I supposed to do? Thanks in advance to anyone who replies.

You can drop the last 2 samples:

dataloader = torch.utils.data.DataLoader(DataSet, batch_size=4, drop_last=True)

Thank you!
By the way, do you know of a way to still use the last 2 samples?

You can still copy and paste 2 of your samples to fill the last batch…
But if you shuffle your data with the loader and run several epochs, it's quite OK to just leave the 2 leftover samples out, since different samples end up being dropped each epoch.
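
For example, here is a minimal sketch of the copy-and-paste idea (my own, not from the thread): a custom collate_fn that duplicates randomly chosen samples until the incomplete last batch reaches the full batch size. It assumes the dataset returns a single feature tensor per item; pad_collate is a hypothetical name, and DataSet is the dataset from the question above.

import random
import torch
from torch.utils.data import DataLoader

def pad_collate(batch, batch_size=4):
    # duplicate randomly chosen samples until the batch is full,
    # so every batch fed to the LSTM has exactly batch_size sequences
    batch = list(batch)
    while len(batch) < batch_size:
        batch.append(random.choice(batch))
    return torch.stack(batch)  # shape: (batch_size, 4)

dataloader = DataLoader(DataSet, batch_size=4, collate_fn=pad_collate)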

Hello,

I’m writing here so as not to open another thread. I understand why my data should have this input shape for an LSTM, but I'm running into a case where dropping the last batch sometimes discards an amount of data equal to 70% of my batch size, and I think that should be avoided. Do you agree?

And if the answer is yes, do you think copying some random samples until the batch is full is the best way to go?