Oct 10 - Hi guys, I am trying to use this code to predict the evolution of a signal whose frequency is a function of time. The idea is that I generate some samples (6000) and, after the usual preprocessing steps (scaling, splitting into training and test data), I use a window of the 100 previous data points, with no additional features, to predict the single data point that follows.
Even though my network is simple, it does not seem able to produce a meaningful prediction. This is easy to see when running the fit: neither the training loss nor the test loss decreases.
What am I doing wrong?
I am posting a link to my notebook, hosted on GitHub.
Let me know if you can't access it; it's my first time using git.
Can you make your question more specific?
First of all, thanks for reaching out, and sorry if I was not clear.
I am trying to make predictions on a series of data points that all belong to the same time series (which I generated with a sinusoidal function whose frequency is time-dependent).
To do so, I built an LSTM network and implemented a simple training loop, to which I feed my preprocessed data after converting it to tensors.
I have appended the model and the training loop; if you manage to look at my code you will probably spot the problem right away.
Thank you for your attention.
import numpy as np
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, n_features, n_hidden, seq_len, n_layers):
        super(LSTM, self).__init__()
        self.n_hidden = n_hidden
        self.seq_len = seq_len
        self.n_layers = n_layers
        # lstm_input.shape = (seq_len = number of input sequences,
        #                     batch_size = number of elements in each sequence,
        #                     input_size = third dimension of the sequence)
        # output.shape = (seq_len, batch, num_directions * hidden_size), with one direction
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
        )
        # the output of the LSTM is passed to the linear layer, which applies a
        # linear transform with random weights and bias (until they are updated)
        self.linear = nn.Linear(in_features=n_hidden, out_features=1)

    def reset_hidden_state(self):
        # builds a tuple of two 3-dimensional zero tensors (hidden state, cell state)
        self.hidden = (torch.zeros(self.n_layers, self.seq_len, self.n_hidden),
                       torch.zeros(self.n_layers, self.seq_len, self.n_hidden))

    def forward(self, sequences):
        # we pick all the sequences and pass them to the LSTM at once
        # lstm_out, self.hidden = self.lstm(sequences.view(len(sequences), self.seq_len, -1), self.hidden)
        lstm_out, self.hidden = self.lstm(sequences.view(len(sequences), 1, -1), self.hidden)
        # print(lstm_out.shape)
        # view keeps all the original data while changing its shape into a 3-dimensional tensor
        last_time_step = lstm_out.view(self.seq_len, len(sequences), self.n_hidden)[-1]
        # print(last_time_step.shape)
        # we pass the output of the last time step to the linear layer to get the prediction
        # y_pred = self.linear(last_time_step)
        y_pred = self.linear(lstm_out.view(len(sequences), -1))
        return y_pred

def train_model(model, train_data, train_labels, test_data=None, test_labels=None):
    # define the loss function as MSE and the optimiser as Adam
    loss_fn = torch.nn.MSELoss(reduction='sum')
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate
    num_epochs = 60
    # initialise the train and test histories to 0
    train_hist = np.zeros(num_epochs)
    test_hist = np.zeros(num_epochs)
    # at each epoch we reset the hidden state, run the whole set, compute the loss
    for t in range(num_epochs):
        model.reset_hidden_state()
        optimiser.zero_grad()
        y_pred = model(X_train)
        loss = loss_fn(y_pred.float(), y_train)
        if test_data is not None:
            with torch.no_grad():  # disables gradient calculation for the lines below
                # predict the output with the model, then compute the loss on the test data
                y_test_pred = model(X_test)
                test_loss = loss_fn(y_test_pred.float(), y_test)
            test_hist[t] = test_loss.item()
            # printing progress
            if t % 1 == 0:
                print(f'Epoch {t} train loss: {loss.item()} test loss: {test_loss.item()}')
        elif t % 1 == 0:
            print(f'Epoch {t} train loss: {loss.item()}')
        # including the losses in the train history
        train_hist[t] = loss.item()
        # the gradient was reset at the top of the loop; compute the new one and take an optimiser step
        # optimiser.zero_grad()
        loss.backward()
        optimiser.step()
    return model.eval(), train_hist, test_hist
Your model looks good to me; maybe the problem is in your input.
I don't know which shape X_train has, but the sequences.view operation might be wrong if you are trying to permute the dimensions.
By default, nn.LSTM expects the input in the shape [seq_len, batch_size, features]. If you want to permute an input of [batch_size, seq_len, features] to this shape, use sequences = sequences.permute(1, 0, 2) instead.
The same applies to the lstm_out tensor, which has the shape [seq_len, batch_size, num_directions*hidden_size] by default.
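To illustrate why view and permute are not interchangeable here, the following is a minimal sketch on a made-up toy batch (2 sequences, 3 time steps, 1 feature; these sizes are assumptions chosen only to keep the numbers readable). Both calls return the same shape, but only permute keeps each sequence intact.

```python
import torch

# Hypothetical toy batch: 2 sequences, 3 time steps, 1 feature.
# sequence 0 = [0, 1, 2], sequence 1 = [3, 4, 5]
batch = torch.arange(6, dtype=torch.float32).reshape(2, 3, 1)

# permute swaps the axes: [batch, seq_len, features] -> [seq_len, batch, features]
permuted = batch.permute(1, 0, 2)
# view only re-chunks the underlying memory into the requested shape
viewed = batch.view(3, 2, 1)

# Read back what each tensor considers to be "sequence 0" over time.
seq0_permuted = permuted[:, 0, 0].tolist()
seq0_viewed = viewed[:, 0, 0].tolist()

print(permuted.shape, viewed.shape)
print(seq0_permuted)  # values of the original sequence 0, in time order
print(seq0_viewed)    # values interleaved from both original sequences
```

Because the shapes match, PyTorch raises no error in the view case; the model simply trains on scrambled sequences.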
My data are a time series of 1000 elements, created as:
# Number of sample points
N = 1000
# sample spacing
T = 1.0 / 500.0
time = np.linspace(0.0, N*T, N)
new_datas = []
for i in range(len(time)):
    new_data = np.sin((4.0 + 6*(i/N)) * 2.0*np.pi*time[i]) + np.random.normal(scale=0.25)
    new_datas.append(new_data)
which I then divide into sequences with:
# sliding-window function
def create_sequences(data, seq_length):
    xs = []
    ys = []
    for i in range(len(data)-seq_length-1):
        x = data[i:(i+seq_length)]
        y = data[i+seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)
where seq_length is the length of the sequence I am supplying, so the first sequence should look like
[s0, s1, … s99] and [s100] is my first label,
then the following sequence will be
[s1, … s100] and the second label will be [s101], and so on.
So the data will have the shape [number of sequences, sequence length].
After converting to tensors and reshaping, it should look like [number of sequences, sequence length, features], where features is 1.
So, if I use 80% of the data for training and the rest for testing, a starting dataset of 1000 points gives two datasets of 800 and 200 points respectively.
Once I split them into sequences with the above function I will have
X_train.shape = (699, 100, 1) = number of sequences, length of each sequence, number of features
y_train.shape = (699, 1) = number of sequences, number of features
torch.Size([699, 100, 1]) torch.Size([699, 1])
torch.Size([99, 100, 1]) torch.Size([99, 1])
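As a quick check of the shapes quoted above, here is a small sketch that runs the same windowing function on a stand-in series. The contiguous 80/20 split in time order and the use of unsqueeze(-1) to add the feature dimension are assumptions, since the thread does not show those steps.

```python
import numpy as np
import torch

def create_sequences(data, seq_length):
    # Same windowing as in the post: seq_length inputs, the next value as label.
    xs, ys = [], []
    for i in range(len(data) - seq_length - 1):
        xs.append(data[i:i + seq_length])
        ys.append(data[i + seq_length])
    return np.array(xs), np.array(ys)

# Stand-in signal: the values do not matter for a shape check.
series = np.arange(1000, dtype=np.float32)
train, test = series[:800], series[800:]  # assumed 80/20 split, in time order

X_train, y_train = create_sequences(train, 100)
X_test, y_test = create_sequences(test, 100)

# Add the trailing feature dimension: [num_sequences, seq_len, 1].
X_train = torch.from_numpy(X_train).unsqueeze(-1)
y_train = torch.from_numpy(y_train).unsqueeze(-1)
X_test = torch.from_numpy(X_test).unsqueeze(-1)
y_test = torch.from_numpy(y_test).unsqueeze(-1)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Note that the `- 1` in the range skips the last usable window, which is why 800 points yield 699 sequences rather than 700.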
Thanks for the information. In that case, you should either use batch_first=True while creating the nn.LSTM module or use the permute approach as mentioned before.
Your current code:
sequences.view(len(sequences), 1, -1)
would reshape the data to [699, 1, 100], which is wrong for the current setup of your nn.LSTM module (explained in the previous post) and would also move the temporal dimension to the features.
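A minimal sketch of the batch_first=True variant could look like the following. The class name WindowLSTM, the hidden size of 32, and the choice to let PyTorch initialise the hidden state to zeros on every call are assumptions for illustration, not the original poster's final code.

```python
import torch
import torch.nn as nn

class WindowLSTM(nn.Module):  # hypothetical name, not from the thread
    def __init__(self, n_features=1, n_hidden=32, n_layers=1):
        super().__init__()
        # batch_first=True makes the LSTM accept [batch, seq_len, features].
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=n_hidden,
                            num_layers=n_layers, batch_first=True)
        self.linear = nn.Linear(n_hidden, 1)

    def forward(self, sequences):
        # No hidden state is passed in, so it starts from zeros on every call.
        lstm_out, _ = self.lstm(sequences)       # [batch, seq_len, n_hidden]
        last_time_step = lstm_out[:, -1, :]      # [batch, n_hidden]
        return self.linear(last_time_step)       # [batch, 1]

dummy = torch.randn(699, 100, 1)  # same shape as X_train in the thread

# The reshape from the question: the 100 time steps end up in the feature axis.
wrong = dummy.view(len(dummy), 1, -1)
print(wrong.shape)

model = WindowLSTM()
y_pred = model(dummy)
print(y_pred.shape)
```

With this layout the 100 time steps stay on the sequence axis, and the prediction is read from the last time step of each window, so no view or permute call is needed.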