I am working on a time-series classification task. The dataset has 14 features with float values in the range [0, 1], and the label is an integer, which makes it multivariate time-series data.
Now I am developing a simple 1D-CNN model followed by fully-connected layers for the classification.
To load the time-series data for training, I defined MTSDataSet, which inherits from the Dataset class, and overrode the necessary methods as below:
class MTSDataSet(Dataset):
    def __init__(self, data_df):
        self.data = data_df

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        x = np.transpose(torch.from_numpy(self.data.iloc[index:index+sig_length, :-1].values))
        y = self.data.iloc[index, -1]
        return x, y
In the __getitem__ method, I used iloc[index:index+sig_length], where sig_length = 60, so that each input has size 14 (number of features) x 60 (signal length) for the 1D-CNN model. Below is how I defined the DataLoader for the training data:
train_loader = DataLoader(train_dataset, batch_size = batch_size, shuffle=True)
And the code below is the definition of the 1D-CNN model followed by fully-connected layers:
class Net0(nn.Module):
    def __init__(self):
        super(Net0, self).__init__()
        self.conv1 = nn.Conv1d(in_channels=14, out_channels=7, kernel_size=10, stride=1)  # IN: 14 x 60, OUT: 7 x 51
        self.conv2 = nn.Conv1d(in_channels=7, out_channels=4, kernel_size=10, stride=1)  # IN: 7 x 51, OUT: 4 x 42
        self.fc1 = nn.Linear(4*42, 32)
        self.fc2 = nn.Linear(32, len(act_map.keys()))

    def forward(self, x):
        out = F.relu(self.conv1(x.float()))
        out = F.relu(self.conv2(out))
        out = out.view(-1, 4*42)
        out = F.relu(self.fc1(out))
        out = self.fc2(out)
        return out
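For reference, the IN/OUT sizes in the comments above can be checked with the output-length formula given in the nn.Conv1d documentation. This is only a small sketch in plain Python; the helper name conv1d_out_length is made up for illustration and is not part of PyTorch.

```python
def conv1d_out_length(length_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution (hypothetical helper, not a PyTorch API).

    Follows the formula from the nn.Conv1d documentation:
    floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    """
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


sig_length = 60                                     # signal length used in the post
after_conv1 = conv1d_out_length(sig_length, 10)     # conv1: kernel_size=10, stride=1
after_conv2 = conv1d_out_length(after_conv1, 10)    # conv2: kernel_size=10, stride=1
flattened = 4 * after_conv2                         # 4 output channels feed fc1

print(after_conv1, after_conv2, flattened)
```

With sig_length = 60 this gives 51 and 42, so fc1 expects 4*42 = 168 inputs, which matches the model definition. It also means the model only works when every sample really has 60 time steps.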
The input data passes through the model without a problem, but I get a RuntimeError whenever I run the training loop:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-132-d7fbe87b310d> in <module>()
5 optimizer = optim.Adam(model.parameters(), lr=lr)
6 loss_fn = nn.CrossEntropyLoss()
----> 7 training_loop(n_epochs = n_epochs, optimizer = optimizer, model = model, loss_fn = loss_fn, train_loader = train_loader)
6 frames
/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
53 storage = elem.storage()._new_shared(numel)
54 out = elem.new(storage)
---> 55 return torch.stack(batch, 0, out=out)
56 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
57 and elem_type.__name__ != 'string_':
RuntimeError: stack expects each tensor to be equal size, but got [14, 60] at entry 0 and [14, 26] at entry 15
I think this error is raised because of self.data.iloc[index:index+sig_length, :-1] in the __getitem__ method of my MTSDataSet class. Whenever the range [index:index+sig_length] runs past the end of the data, the slice contains fewer than sig_length data points, so the samples in a batch have different sizes and cannot be stacked. But I'm not sure how to change my Dataset or DataLoader definitions to guarantee that every loaded sample contains the required number of data points.
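To illustrate what I suspect is happening, here is a minimal sketch that uses a plain Python list in place of the DataFrame (iloc slices are truncated at the end of the data in the same way). The size n_rows = 100 and the helper names window_length and num_full_windows are made up for illustration.

```python
sig_length = 60   # window length used in the post
n_rows = 100      # hypothetical number of rows in the training data

rows = list(range(n_rows))  # stand-in for the rows of the DataFrame


def window_length(index):
    """Number of rows actually returned by the slice [index:index+sig_length]."""
    return len(rows[index:index + sig_length])


def num_full_windows(n_rows, sig_length):
    """Number of start indices for which a full window of sig_length rows exists."""
    return max(0, n_rows - sig_length + 1)


print(window_length(0))    # full window at the start of the data
print(window_length(74))   # truncated window: only 100 - 74 rows remain
print(num_full_windows(n_rows, sig_length))
```

Here the window starting at index 74 has only 26 rows, which has the same shape mismatch as the [14, 26] entry in the error. If this diagnosis is right, one option I can think of is for __len__ to return len(self.data) - sig_length + 1, so that the DataLoader never requests a start index without a full window, but I do not know whether that is the usual approach.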
Is there an effective or standard way to define a Dataset and DataLoader for a 1D CNN with 2D input data that would resolve the RuntimeError I am getting?