Help please! How to feed RNN with two time steps/inputs?

Hello there! Here’s my problem:

I’m trying to use an RNN for a classification problem in which I have 25 classes. The dataset consists of 1500 samples that are vectors of 800 columns, so each input has shape (1, 800).
For each class I have a number of samples, and because each element of the vector represents an RSS value, the samples within each class are correlated.

What I want to do is feed the RNN with two vectors, each one with its respective label/target, so the hidden state can be shared between the first and the second and I can take advantage of the RNN’s sequential information.
However, I don’t know how to express that. Do I have to create the “sequences” in the dataset, or is it something about how I feed the model?

Thanks!!

RNNs in PyTorch expect the input to have a temporal dimension. The default input shape would be [seq_len, batch_size, features], where seq_len defines the temporal sequence length.
You could use batch_first=True, which would then expect an input of [batch_size, seq_len, features].
The RNN would execute the corresponding operations over the complete sequence automatically.
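
For example, a quick shape check (a minimal sketch; the sizes are arbitrary):

import torch
import torch.nn as nn

# default layout: [seq_len, batch_size, features]
rnn = nn.RNN(input_size=800, hidden_size=35)
out, h = rnn(torch.randn(2, 4, 800))
print(out.shape)  # torch.Size([2, 4, 35]) -> [seq_len, batch_size, hidden_size]

# with batch_first=True the layout is [batch_size, seq_len, features]
rnn_bf = nn.RNN(input_size=800, hidden_size=35, batch_first=True)
out, h = rnn_bf(torch.randn(4, 2, 800))
print(out.shape)  # torch.Size([4, 2, 35]) -> [batch_size, seq_len, hidden_size]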

Thanks, I get that; however, I don’t know how to prepare my data to adopt that shape.
I create the dataset with:

class WifiDataset(Dataset):
    def __init__(self, data, labels=None, transforms=None):
        self.X = data
        self.y = labels
        self.transforms = transforms

    def __len__(self):
        return len(np.asarray(self.X))

    def __getitem__(self, i):
        data = self.X[i]
        if self.transforms:
            data = self.transforms(data)
        if self.y is not None:
            return data, self.y[i]
        else:
            return data

With this I create my dataset and then my DataLoader, where data, target in the loader have the shapes [batch_size, 1, input_size (=800)] and [batch_size], respectively.
I don’t know whether it’s after creating the DataLoader that I have to do something to turn the inputs into [seq_len, batch_size, features], or whether it’s in the dataset creation that I have to make each item contain seq_len samples instead of just one.
Thanks!

I also tried coding it like this (it’s an example to test how to build the sequences):

class DataPrueba(Dataset):
    def __init__(self, data, seq, labels=None, transforms=None):
        self.X = data
        self.y = labels
        self.seq = seq
        self.transforms = transforms

    def __len__(self):
        return len(np.asarray(self.X))

    def __getitem__(self, i):
        if self.y is not None:
            # the window would run past the end of the data -> return the shorter tail
            if i + self.seq > self.__len__():
                if self.transforms is not None:
                    item = []
                    item[:self.__len__()-i] = self.transforms(self.X[i:])
                    target_item = []
                    target_item[:self.__len__()-i] = self.transforms(self.y[i:])
                    data = item
                    labels = target_item
                else:
                    item = []
                    item[:self.__len__()-i] = self.X[i:]
                    data = item
                    target_item = []
                    target_item[:self.__len__()-i] = self.y[i:]
                    labels = target_item
            else:
                # full window: seq consecutive samples and their targets
                if self.transforms is not None:
                    data = self.transforms(self.X[i:i+self.seq])
                    labels = self.transforms(self.y[i:i+self.seq])
                else:
                    data = self.X[i:i+self.seq]
                    labels = self.y[i:i+self.seq]
            return data, labels
        else:
            if i + self.seq > self.__len__():
                if self.transforms is not None:
                    item = []
                    item[:self.__len__()-i] = self.transforms(self.X[i:])
                    data = item
                else:
                    item = []
                    item[:self.__len__()-i] = self.X[i:]
                    data = item
            else:
                if self.transforms is not None:
                    data = self.transforms(self.X[i:i+self.seq])
                else:
                    data = self.X[i:i+self.seq]
            return data
        
        
x = torch.randn(6, 10)
labels = torch.randint(0, 6, (6,))

dataset = DataPrueba(x, seq=2, labels=labels, transforms=None)       
loader = DataLoader(dataset, batch_size=2)
in_size = x.shape[1]

for i, j in loader:
    print(i)
    print(i.shape)
    print(j)
    print(j.shape)
    break
        

model = nn.Sequential(nn.LSTM(input_size=10, hidden_size=35, num_layers=1, batch_first=True),
                      nn.Linear(35, 6))

input_seq = i
output_seq, _ = model(torch.FloatTensor(input_seq))
last_output = output_seq[-1]

loss = nn.CrossEntropyLoss()
err = loss(last_output, j)
err.backward()

However, if the Linear layer is uncommented I get this error, which I guess happens because the input is two sequences:

Traceback (most recent call last):

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 110, in <module>
    output_seq, _ = model(torch.FloatTensor(input_seq))

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 1688, in linear
    if input.dim() == 2 and bias is not None:

AttributeError: 'tuple' object has no attribute 'dim'

And when the Linear layer is commented out, CrossEntropyLoss can’t be applied: since the input is two sequences, the target is a 2D tensor, while CrossEntropyLoss expects a 1D target tensor…

Traceback (most recent call last):

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 114, in <module>
    err = loss(last_output, j)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 961, in forward
    return F.cross_entropy(input, target, weight=self.weight,

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2468, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2264, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

RuntimeError: 1D target tensor expected, multi-target not supported
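
A minimal snippet that reproduces this loss error (a sketch; the sizes match the example above):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(2, 35)           # output_seq[-1]: [seq_len, hidden_size]
target = torch.randint(0, 6, (2, 2))  # the batched labels j: [batch_size, seq_len]
criterion(logits, target)             # -> RuntimeError: 1D target tensor expected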

Any idea on how to fix it? :pensive:

The idea of creating the sequences inside the Dataset is valid, and you could either stick to your loop or have a look at e.g. this simple post to see how to create “sliding windows”.

I would not recommend putting an nn.LSTM into an nn.Sequential container, since it works with multiple inputs and outputs.
Create a custom nn.Module instead, so that you can e.g. feed the last time step of the LSTM output into the linear layer.

I assume you are using the complete hidden state as the model output, which won’t work in this case.
If each sequence has a single target, you could e.g. use the last time step of the lstm output or reduce the temporal dimension in any other way (e.g. taking the mean).
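
Something along these lines (a minimal sketch using the toy sizes from the example above; LSTMClassifier is just a placeholder name):

import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)   # out: [batch_size, seq_len, hidden_size]
        last = out[:, -1]       # last time step: [batch_size, hidden_size]
        # alternatively, reduce the temporal dimension via e.g. out.mean(dim=1)
        return self.fc(last)    # logits: [batch_size, num_classes]

model = LSTMClassifier(input_size=10, hidden_size=35, num_classes=6)
x = torch.randn(2, 2, 10)                        # [batch_size, seq_len, features]
target = torch.randint(0, 6, (2,))               # one class index per sequence
loss = nn.CrossEntropyLoss()(model(x), target)   # [N, C] logits + 1D target works
loss.backward()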

I get it, thanks. I’m using some kind of “sliding window”; however, the last element of the (i, j) pairs in the loader does not have the same dimensions as the previous ones. When idx + seq_length (seq_length is the sliding-window size in my example) is greater than self.__len__(), the last item returned by __getitem__ has only len - idx elements. For example, imagine the sequence length is 2 and the dataset is created with:

  • input size = 10
  • batch_size = 2
  • sequence_length = 2
  • nº of samples = 6

Then data, target in loader returns:

  • data.shape = [2, 2, 10], target.shape = [2, 2] for the first iteration
  • data.shape = [2, 2, 10], target.shape = [2, 2] for the second iteration

and the third iteration raises an error like:

    RuntimeError: stack expects each tensor to be equal size, but got [2, 10] at entry 0 and [1, 10] at entry 1

because the last windows don’t satisfy the sequence-length condition. How could I solve that? My data are vectors that contain 0 values, and 0 is also a valid target, so I don’t think zero-padding would be reliable: during training the model would mistake the padded entries for real data and targets. Below is the code I’m using (it’s a test script, because my dataset is much larger and I’d rather try something smaller first). Thanks!

class DataPrueba(Dataset):
    def __init__(self, data, seq, labels=None, transforms=None):
        self.X = data
        self.y = labels
        self.seq = seq
        self.transforms = transforms

    def __len__(self):
        return len(np.asarray(self.X))

    def __getitem__(self, i):
        if self.y is not None:
            if i + self.seq > self.__len__():
                if self.transforms is not None:
                    item = []
                    item[:self.__len__()-i] = self.transforms(self.X[i:])
                    target_item = []
                    target_item[:self.__len__()-i] = self.transforms(self.y[i:])
                    data = item
                    labels = target_item
                else:
                    data = torch.zeros_like(self.X[i:])
                    data[:self.__len__()-i] = self.X[i:]
                    labels = torch.zeros_like(self.y[i:])
                    labels[:self.__len__()-i] = self.y[i:]
            else:
                if self.transforms is not None:
                    data = self.transforms(self.X[i:i+self.seq])
                    labels = self.transforms(self.y[i:i+self.seq])
                else:
                    data = self.X[i:i+self.seq]
                    labels = self.y[i:i+self.seq]
            return torch.FloatTensor(data), labels
        else:
            if i + self.seq > self.__len__():
                if self.transforms is not None:
                    item = []
                    item[:self.__len__()-i] = self.transforms(self.X[i:])
                    data = item
                else:
                    item = []
                    item[:self.__len__()-i] = self.X[i:]
                    data = item
            else:
                if self.transforms is not None:
                    data = self.transforms(self.X[i:i+self.seq])
                else:
                    data = self.X[i:i+self.seq]
            return torch.FloatTensor(data)
        
        
x = torch.randn(6, 10)
labels = torch.randint(0, 6, (6,))
num_classes = 6
hidden_size = 35
batch_size = 2
seq = 2
num_layers = 1

transform = transforms.Lambda(lambda x: listToTensor(x))
dataset = DataPrueba(x, seq=seq, labels=labels, transforms=None)       
loader = DataLoader(dataset, batch_size=batch_size, drop_last=True)
in_size = x.shape[1]

for idx, (i, j) in enumerate(loader):
    print(idx)
    print(str(i)+'\t'+str(i.shape))
    print(str(j)+'\t'+str(j.shape))
    
# RNN followed by a fully connected classification layer
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)
        
    def forward(self, x):
        # Set the initial hidden state (an LSTM would also need a cell state)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        # Forward propagate RNN
        out, _ = self.rnn(x, h0)  
        out = self.fc(out)
        return out

model = RNN(in_size, hidden_size, num_layers, num_classes)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for i, (images, labels) in enumerate(loader):  
        # reshape to [batch_size, seq_len, input_size]
        images = images.reshape(-1, seq, in_size)
        print(images.shape)
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs.permute(0, 2, 1), labels)  # CE per time step: [N, C, seq] vs [N, seq]

In the posted code snippet I’ve reduced the length of the Dataset by the window so that only “full” windows would be returned. I think this would also work for you.
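
For reference, a minimal sketch of that idea (WindowDataset is a made-up name, not the code from the linked post); with __len__ = len(X) - seq + 1 every start index leaves room for a full window:

import torch
from torch.utils.data import Dataset, DataLoader

class WindowDataset(Dataset):
    def __init__(self, data, labels, seq):
        self.X = data
        self.y = labels
        self.seq = seq

    def __len__(self):
        # number of full windows
        return len(self.X) - self.seq + 1

    def __getitem__(self, i):
        return self.X[i:i + self.seq], self.y[i:i + self.seq]

x = torch.randn(6, 10)
labels = torch.randint(0, 6, (6,))
loader = DataLoader(WindowDataset(x, labels, seq=2), batch_size=2)
for data, target in loader:
    print(data.shape, target.shape)  # always [.., 2, 10] and [.., 2]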

I tried doing that by replacing the line

        return len(np.asarray(self.X))

with

        return len(self.X) - self.seq

but this error appears if I do that:

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 108, in <module>
    for idx, (i, j) in enumerate(loader):

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 475, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 57, in __getitem__
    data[:self.__len__()-i] = self.X[i:]

RuntimeError: The expanded size of the tensor (1) must match the existing size (3) at non-singleton dimension 0.  Target sizes: [1, 10].  Tensor sizes: [3, 10]

This error is expected now, since you’ve reduced the __len__ of the dataset, and this slicing will thus fail:

data = torch.zeros_like(self.X[i:])
data[:self.__len__()-i] = self.X[i:]
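
To illustrate with the sizes from your example (a minimal sketch):

import torch

X = torch.randn(6, 10)
seq = 2
length = len(X) - seq           # the reduced __len__ -> 4
i = 3                           # still a valid index for the reduced length
data = torch.zeros_like(X[i:])  # shape [3, 10]
data[:length - i] = X[i:]       # writes [3, 10] into data[:1] -> RuntimeError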

I’m currently unsure why you are initializing data as a zero tensor in the shape of X[i:] and assigning the values in the next line via a slicing operation.
Wouldn’t data = self.X[i:] just work?

I changed it to:

class DataPrueba(Dataset):
    def __init__(self, data, seq, labels=None, transforms=None):
        self.X = data
        self.y = labels
        self.seq = seq
        self.transforms = transforms
    def __len__(self):
        return len(self.X) - self.seq
    def __getitem__(self, i):
        if self.y is not None:
            if i + self.seq > self.__len__():
                if self.transforms is not None:
                    item = []
                    item[:self.__len__()-i] = self.transforms(self.X[i:])
                    target_item = []
                    target_item[:self.__len__()-i] = self.transforms(self.y[i:])
                    data = item
                    labels = target_item
                else:
                    data = self.X[i:]
                    labels = self.y[i:]
            # ... (rest of __getitem__ unchanged)

However, I still get the size-mismatch error from before:

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 113, in <module>
    for idx, (i, j) in enumerate(loader):

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 475, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in default_collate
    return [default_collate(samples) for samples in transposed]

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in <listcomp>
    return [default_collate(samples) for samples in transposed]

  File "/home/lauram/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)

RuntimeError: stack expects each tensor to be equal size, but got [2, 10] at entry 0 and [3, 10] at entry 1

I also thought data = self.X[i:] would just work; however (maybe because of the slicing) the next error appears:

  File "/home/lauram/Desktop/RNN_TIMESEQ.py", line 57, in __getitem__
    data[:self.__len__()-i] = self.X[i:]

UnboundLocalError: local variable 'data' referenced before assignment 

It must have been because of the slicing though, since that error is not happening now.