I’ll start off by saying that I know very little about deep learning, but I still wanted to try applying it to some of the work I’ve been doing.
Working through one of the tutorials, I built a neural network made up of the following components:
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
y_pred = model(train_x)
I wanted to use an LSTM network, so I tried to do the following:
model = torch.nn.Sequential(
    torch.nn.LSTM(D_in, H),
    torch.nn.Linear(H, D_out),
)
y_pred = model(train_x)
This gave me the following error:
RuntimeError: input must have 3 dimensions, got 2
I found that the input expected by an LSTM network is different from that expected by a Linear transformation; the docs say:
input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence
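To make the difference concrete, here is a small sketch (with made-up sizes) showing that an LSTM happily accepts a 3-D tensor but rejects the 2-D input a Linear layer would take:

```python
import torch

# Made-up sizes for illustration
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

lstm = torch.nn.LSTM(input_size, hidden_size)

x = torch.randn(seq_len, batch, input_size)  # 3-D input, as the docs require
output, (h_n, c_n) = lstm(x)
print(output.shape)  # torch.Size([5, 3, 20]) — (seq_len, batch, hidden_size)

x2d = torch.randn(batch, input_size)  # 2-D input, like a Linear layer takes
# lstm(x2d)  # raises RuntimeError: input must have 3 dimensions, got 2
```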
I was able to work around it by splitting my Sequential container into two layers, as well as reshaping my input/output to/from the LSTM layer like so:
layerA = torch.nn.LSTM(D_in, H)
layerB = torch.nn.Linear(H, D_out)

train_x = train_x.unsqueeze(0)      # (N, D_in) -> (1, N, D_in): add a seq_len dimension of 1
y_pred, (hn, cn) = layerA(train_x)  # y_pred: (1, N, H)
y_pred = y_pred.squeeze(0)          # (1, N, H) -> (N, H)
y_pred = layerB(y_pred)             # (N, D_out)
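For reference, here is a self-contained, runnable version of that workaround with made-up sizes, which at least confirms the shapes line up end to end:

```python
import torch

N, D_in, H, D_out = 64, 10, 20, 2  # hypothetical sizes

layerA = torch.nn.LSTM(D_in, H)
layerB = torch.nn.Linear(H, D_out)

train_x = torch.randn(N, D_in)  # 2-D input, as for a Linear layer

x = train_x.unsqueeze(0)   # (1, N, D_in): N length-1 sequences
y, (hn, cn) = layerA(x)    # y: (1, N, H)
y = y.squeeze(0)           # (N, H)
y_pred = layerB(y)         # (N, D_out)

print(y_pred.shape)  # torch.Size([64, 2])
```

Note that with this reshaping, each sample is treated as its own sequence of length 1, so no state is carried across samples.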
However, simply getting it to run doesn’t reassure me, because I suspect I’m using the network incorrectly. My questions are as follows:
- How can I use an LSTM network as part of a Sequential container?
- Why is the input to an LSTM network different from that to a Linear one? What is the significance of the outermost dimension?
- What is the correct way to use DataLoader in conjunction with an LSTM network? I’m using the default DataLoader, which doesn’t seem to play well with nn.LSTM:
test_x_data = torch.FloatTensor(x)
test_y_data = torch.FloatTensor(y)
test_dataset = data_utils.TensorDataset(test_x_data, test_y_data)
test_loader = data_utils.DataLoader(test_dataset, batch_size=1, shuffle=True)
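To illustrate what I mean by not playing well: a minimal sketch (with made-up data) showing that the batches a default DataLoader yields are 2-D, so they can’t be fed straight into an nn.LSTM:

```python
import torch
import torch.utils.data as data_utils

# Made-up data for illustration
x = torch.randn(8, 10)  # 8 samples, 10 features each
y = torch.randn(8, 2)

dataset = data_utils.TensorDataset(x, y)
loader = data_utils.DataLoader(dataset, batch_size=1, shuffle=True)

batch_x, batch_y = next(iter(loader))
print(batch_x.shape)  # torch.Size([1, 10]) — 2-D (batch, features)

# nn.LSTM wants 3-D input, so each batch has to be reshaped first,
# e.g. batch_x.unsqueeze(0) to get (1, 1, 10)
```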
This is a relatively open-ended question, so I appreciate your time in advance!