Hey everyone,
I am working on a private dataset to forecast patient visits at the Emergency Department.
I am trying to build an LSTM-based encoder-decoder (Seq2Seq) model with General Attention. However, I am running into an error, and I also have a few questions, since I am not used to working with sequence models:
- My dataset has 10 columns, one of which is the target variable.
- Should my X tensor be of shape (batch_size, sequence_length, 9) or (batch_size, sequence_length, 10) including the target?
- Should my y tensor be of shape (batch_size, forecasting_horizon) or (batch_size, seq, forecasting_horizon)?
- I can’t share the dataset, but here is some code for context:
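First, a toy version of how I currently build the windows (NumPy, with a random array standing in for my real dataset; all the sizes here are made up):

```python
import numpy as np

# Dummy stand-in for my dataset: 10 columns, the last one is the target.
data = np.random.rand(1000, 10)
seq_len, horizon = 30, 7  # made-up values

X_list, y_list = [], []
for t in range(len(data) - seq_len - horizon + 1):
    X_list.append(data[t : t + seq_len])                          # all 10 columns -- or only the 9 features?
    y_list.append(data[t + seq_len : t + seq_len + horizon, -1])  # next `horizon` target values

X = np.stack(X_list)  # (n_samples, 30, 10)
y = np.stack(y_list)  # (n_samples, 7)
```

And here is the model itself: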
```python
import random

import torch
import torch.nn as nn
import torch.optim as optim


class Seq2Seq(nn.Module):
    def __init__(self, encoder, decoder, attention, teacher_ratio, bidirectional=1, device="cpu"):
        super(Seq2Seq, self).__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.attention = attention
        self.fc = nn.Linear(self.encoder.hidden_size * bidirectional, 1)
        self.device = device
        self.teacher_ratio = teacher_ratio

    def _get_top_layer_hidden_state(self, hidden_state):
        hidden_state, _ = hidden_state  # keep h_n, drop c_n
        return hidden_state[-1, :, :]   # top layer only

    def forward(self, batch):
        x, y = batch
        x = x.to(self.device)
        encoded_outputs, hidden_state = self.encoder(x)
        y = y.unsqueeze(2)
        y_hat = torch.zeros_like(y, device=y.device)
        dec_input = x[:, -1:, :]  # last encoder input step as the first decoder input
        for i in range(y.size(1)):
            top_hidden_state = self._get_top_layer_hidden_state(hidden_state)
            context = self.attention(top_hidden_state.unsqueeze(1), encoded_outputs)
            dec_input = torch.cat((dec_input, context.unsqueeze(1)), dim=-1)
            output, hidden_state = self.decoder(dec_input, hidden_state)
            output = self.fc(output)
            y_hat[:, i, :] = output.squeeze(1)
            # Teacher forcing:
            teacher_force = random.random() < self.teacher_ratio
            if teacher_force:
                # use the ground-truth value as the next decoder input
                dec_input = y[:, i, :].unsqueeze(1)
            else:
                # use the model's prediction as the next decoder input
                dec_input = output
            dec_input = dec_input.to(x.device)
        return y_hat, y
```
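In case it matters: `GeneralAttention` is meant to be Luong-style "general" attention. Simplified, it does roughly this (my actual code differs slightly):

```python
class GeneralAttention(nn.Module):
    # Luong "general" scoring: score(h_t, h_s) = h_t^T W h_s
    def __init__(self, encoder_dim, decoder_dim):
        super().__init__()
        self.W = nn.Linear(encoder_dim, decoder_dim, bias=False)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (B, 1, decoder_dim); encoder_outputs: (B, T, encoder_dim)
        scores = torch.bmm(decoder_hidden, self.W(encoder_outputs).transpose(1, 2))  # (B, 1, T)
        weights = torch.softmax(scores, dim=-1)        # attention weights over encoder steps
        context = torch.bmm(weights, encoder_outputs)  # (B, 1, encoder_dim)
        return context.squeeze(1)                      # (B, encoder_dim)
```

And the training setup: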
```python
model = Seq2Seq(
    encoder=nn.LSTM(input_size, hidden_size, num_layers=num_layers, dropout=0.0, batch_first=True),
    decoder=nn.LSTM(input_size + hidden_size, hidden_size, num_layers=num_layers, dropout=0.0, batch_first=True),
    attention=GeneralAttention(encoder_dim=hidden_size, decoder_dim=hidden_size),
    teacher_ratio=0.3,
    bidirectional=1,
    device=device,
).to(device)

loss_fn = nn.MSELoss()
optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

train_loader = train_dataset.tensor_loader.loader
val_loader = val_dataset.tensor_loader.loader

best_val_loss = float("inf")
current_patience = 0
best_model_state_dict = None

# Training loop
for epoch in range(num_epochs):
    model.train()
    training_losses = []
    for batch in train_loader:
        optimizer.zero_grad()
        y_hat, y = model(batch)
        loss = loss_fn(y_hat, y)
        loss.backward()
        optimizer.step()
        training_losses.append(loss.item())
    # .... (validation / early stopping omitted)
```
When running the forward method in the first epoch, everything executes fine for i=0, but I hit an error at i=1. As you can see, `dec_input` changes shape in the teacher-forcing branch:
```python
teacher_force = random.random() < self.teacher_ratio
if teacher_force:
    # use the ground-truth value as the next decoder input
    dec_input = y[:, i, :].unsqueeze(1)
else:
    # use the model's prediction as the next decoder input
    dec_input = output
```
When i=0, `dec_input` is fed into `self.decoder(...)` with shape (64, 1, 290): 64 is my batch size, the 1 comes from slicing the last time step of the encoder input (`x[:, -1:, :]`), and 290 is the hidden size (280) plus the input feature size (10). After teacher forcing, however, `dec_input` becomes (64, 1, 1), so the concatenation with the context produces (64, 1, 281) and the decode fails:

```
RuntimeError: input.size(-1) must be equal to input_size. Expected 290, got 281
```
The failing lines are:

```python
dec_input = torch.cat((dec_input, context.unsqueeze(1)), dim=-1)
output, hidden_state = self.decoder(dec_input, hidden_state)
```
This occurs because my decoder is configured with an input_size of encoder_input_size + hidden_size (10 + 280 = 290). I think I am missing something fundamental about how the decoder inputs in a Seq2Seq model are supposed to be built.
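To make the problem easy to reproduce without my data, here is a minimal sketch with dummy tensors (shapes taken from the error above):

```python
import torch

batch_size, hidden_size, n_features = 64, 280, 10
context = torch.randn(batch_size, 1, hidden_size)  # attention context

# i=0: decoder input is the last encoder input step (10 features)
dec_input = torch.randn(batch_size, 1, n_features)
print(torch.cat((dec_input, context), dim=-1).shape)  # torch.Size([64, 1, 290]) -> OK

# i>0 with teacher forcing: decoder input is the 1-dim target value
dec_input = torch.randn(batch_size, 1, 1)
print(torch.cat((dec_input, context), dim=-1).shape)  # torch.Size([64, 1, 281]) -> mismatch
```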
Thanks for your help!