Hello! I have been doing some research on aircraft trajectory prediction. I want to test the performance of different network architectures with points sampled every 30/60/90/120 seconds. In particular, from a sequence of the last n increments in three features (longitude, latitude, and height), I try to predict the next increment. Since in most examples the plane follows a relatively straight line, I expected good results except in some complex cases where the plane is turning or performing a maneuver. However, the results are terrible overall.
Here is an example of input (sequence of 4 elements, each with 3 features):
```
xs_batch
tensor([[-0.0156, -0.0226, -0.0750],
        [-0.0146, -0.0226, -0.0750],
        [-0.0161, -0.0226, -0.0750],
        [-0.0139, -0.0226, -0.0750]], device='cuda:0')
```
And an example of intended output (prediction of the next value for the 3 features):
```
ys_batch
tensor([-0.0147, -0.0226, -0.0750], device='cuda:0')
```
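As a sanity check, a trivial persistence baseline ("predict that the next increment equals the last one in the sequence") already gets very close on this example. This is a small self-contained sketch using the batch values above (batch dimension of 1 added for illustration):

```python
import torch
import torch.nn.functional as F

# The example batch above, shaped (batch, sequence, features)
xs_batch = torch.tensor([[[-0.0156, -0.0226, -0.0750],
                          [-0.0146, -0.0226, -0.0750],
                          [-0.0161, -0.0226, -0.0750],
                          [-0.0139, -0.0226, -0.0750]]])
ys_batch = torch.tensor([[-0.0147, -0.0226, -0.0750]])

# Persistence baseline: the prediction is simply the last increment
pred = xs_batch[:, -1, :]
baseline_mse = F.mse_loss(pred, ys_batch)
print(baseline_mse.item())
```

Any trained network should at least beat this baseline's MSE; on this example the baseline error is on the order of 1e-7.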
I first tested with a simple network that should, at the very least, learn that the predicted increment is very similar to the last one in the sequence:
```python
# Function for creating generic blocks of linear layers with activations and optional dropout
def create_dense_block(input_size, output_size, hidden_sizes: list, dropout_rate=0.1):
    layers = []
    input_sizes = [input_size] + hidden_sizes
    output_sizes = hidden_sizes + [output_size]
    for iz, oz in zip(input_sizes, output_sizes):
        if dropout_rate > 0:
            layers.append(nn.Dropout(dropout_rate))
        layers.append(nn.Linear(iz, oz))
        layers.append(nn.Tanh())
    return nn.Sequential(*layers)


# The feedforward architecture
class DenseNetwork(nn.Module):
    def __init__(self, number_features, sequence_size):
        super(DenseNetwork, self).__init__()
        # Block of linear layers that ends with an output of size 100
        self.fc1 = create_dense_block(number_features * sequence_size, 100,
                                      [500, 500, 400, 200], dropout_rate=0)
        # Final linear layer without an activation, to allow negative outputs
        self.fc2 = nn.Linear(100, number_features)

    def forward(self, x: torch.Tensor):
        shape = x.shape
        # Reshape sequences of `sequence_size` elements with `number_features`
        # features into flat vectors of sequence_size * number_features
        x = x.reshape((shape[0], -1))
        x = self.fc1(x)
        x = self.fc2(x)
        return x
```
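For reference, the flattening step can be exercised with a minimal stand-in model (using `nn.Flatten` instead of the manual reshape; the layer sizes here are illustrative, not the ones above):

```python
import torch
import torch.nn as nn

# Minimal stand-in for the architecture: flatten (batch, 4, 3) -> (batch, 12),
# then map to 3 outputs with no final activation
model = nn.Sequential(
    nn.Flatten(),          # flattens all dims after the batch dim
    nn.Linear(4 * 3, 100),
    nn.Tanh(),
    nn.Linear(100, 3),
)

x = torch.randn(128, 4, 3)
out = model(x)
print(out.shape)  # torch.Size([128, 3])
```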
The training code is here:
```python
model = architectures.DenseNetwork(len(features_x), seq_size)
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), learning_rate)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, verbose=True, threshold=1e-2)
model.to(device)
model.train()

for epoch in range(num_epochs):
    for batch in loader_training:
        xs_batch: torch.Tensor = batch["inputs"]
        ys_batch: torch.Tensor = batch["labels"]
        xs_batch = xs_batch.to(device).float()
        ys_batch = ys_batch.to(device).float()
        model.zero_grad()
        out = model(xs_batch)
        loss = loss_function(out, ys_batch)
        loss.backward()
        optimizer.step()
```
The training data contains more than 10,000 examples. I have tested with more and fewer layers, smaller and bigger ones, and with higher and lower learning rates, and I have confirmed that the shapes of the output and of every intermediate step of the network are correct:
```
xs_batch.shape  torch.Size([128, 4, 3])
ys_batch.shape  torch.Size([128, 3])
out.shape       torch.Size([128, 3])
```
And yet the results are quite bad, and bizarre at times, so I wonder if there is some detail related to the parameters or the setup that I'm missing.