DataLoader for an LSTM Model with a Sliding Window

I am working on an LSTM model and trying to use a DataLoader to provide the data. I am using stock price data, and my dataset consists of:

  • Date (string)
  • Closing Price (float)
  • Price Change (float)

Right now I am just looking for a good example of an LSTM using similar data so I can configure my Dataset and DataLoader correctly.

To test my DataLoader I have the following code:

    for i, d in enumerate(dataloader):
        print(i, d)

With the following definition of dataloader, the test works:

    dataloader = DataLoader(pricedata,
                            batch_size=30,
                            shuffle=False,
                            num_workers=4)

This gives me x batches of size 30, which makes sense.

However, I need to use a sliding window of size n, so, assuming there are k instances in the dataset, I would like k-n batches with n instances in each batch.

So I redefined dataloader as:

    dataloader = DataLoader(pricedata,
                            batch_sampler=torch.utils.data.sampler.SequentialSampler(pricedata),
                            shuffle=False,
                            num_workers=4)

With this change I get the following error message:

    TypeError: 'int' object is not iterable

when the code reaches the test loop:

    for i, d in enumerate(dataloader):

Based on this I have two questions:

  1. Is torch.utils.data.sampler.SequentialSampler the appropriate sampler to use for a sliding window?
  2. Can anyone point me to a good example of configuring an LSTM using a DataLoader to load numeric data? All the examples I have seen are for NLP.

Thanks


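The SequentialSampler samples your data sequentially, always in the same order. It yields one integer index at a time, whereas batch_sampler expects a sampler that yields a list of indices per batch, which is why you get the TypeError.
To use a sliding window, I would create a custom Dataset and use the __getitem__ method to return the window.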
Here is a small example (untested):

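    from torch.utils.data import Dataset

    class MyDataset(Dataset):
        def __init__(self, data, window):
            self.data = data      # indexable sequence of samples, e.g. a tensor
            self.window = window  # number of consecutive samples per window

        def __getitem__(self, index):
            # Return the window that starts at position `index`.
            x = self.data[index:index+self.window]
            return x

        def __len__(self):
            # One item per window start: k - n windows for k samples.
            return len(self.data) - self.window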
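To show how such a Dataset might plug into a DataLoader and an LSTM, here is a minimal sketch. Everything specific in it is an assumption for illustration: the `WindowDataset` name, the made-up `prices` tensor standing in for the real data, the choice of column 0 as the closing price, and the next closing price as the target. The Date column is assumed to have been dropped or converted beforehand, since strings cannot go into a tensor.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class WindowDataset(Dataset):
    """Hypothetical sliding-window Dataset that also returns a target."""

    def __init__(self, data, window):
        self.data = data      # tensor of shape (k, num_features)
        self.window = window  # n, the number of time steps per sample

    def __getitem__(self, index):
        # Input: n consecutive rows starting at `index`.
        x = self.data[index:index + self.window]
        # Target (assumption): closing price of the row right after the window.
        y = self.data[index + self.window, 0]
        return x, y

    def __len__(self):
        # k - n samples, so that index + window is always a valid row.
        return len(self.data) - self.window


# Made-up stand-in for the price data: 100 rows of (closing price, price change).
prices = torch.arange(200, dtype=torch.float32).reshape(100, 2)

dataset = WindowDataset(prices, window=30)
loader = DataLoader(dataset, batch_size=8, shuffle=False)

# Each batch of inputs has shape (batch, window, features).
xb, yb = next(iter(loader))

# batch_first=True makes nn.LSTM accept (batch, seq_len, features) input.
lstm = torch.nn.LSTM(input_size=2, hidden_size=16, batch_first=True)
out, _ = lstm(xb)
```

With this layout each sample is one window, so `batch_size` controls how many windows are stacked per step rather than the window length itself. Shuffling the windows is generally fine here because the time order is preserved inside each window.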

thank you! that’s awesome!