Confused about LSTM/RNN data iterator

I am attempting to use an LSTM to classify the type of weather on a particular day at a specific location, based on features such as humidity, temperature, etc. My data therefore takes the format [locations, days_for_that_location, number_of_dimensions], which works out to [100, x, 8], where x is anywhere between 900 and 6000 - I have more data for some locations than for others. I have to classify each day as Sunny, Rainy, Snowy, etc., so the label dataset has shape [locations, days_per_location, 1], i.e. [100, x, 1], where the last dimension is a number from 0 to 6, each number representing a type of weather - so that I can use cross-entropy loss. Below is what I built, which I do not think is correct.
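For reference, here is a minimal sketch of dummy data in that layout (the shapes and value ranges are from the description above; the random values and variable names are placeholders, not my real data):

```python
import torch

torch.manual_seed(0)
locations = []  # per-location feature matrices
labels = []     # per-location, per-day class ids

for _ in range(100):                               # 100 locations
    x = torch.randint(900, 6001, (1,)).item()      # variable number of days
    locations.append(torch.randn(x, 8))            # [days, 8] features per location
    labels.append(torch.randint(0, 7, (x,)))       # one class id (0-6) per day
```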

In a sense, I am trying to apply PoS-tagging techniques to this problem.

from torch.utils.data import Dataset, DataLoader

class WeatherData(Dataset):
    def __init__(self, location):
        self.samples = []
        for day in location:
            self.samples.append((day['features'], day['labels']))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

DatasetW = WeatherData(Data)
train_iterator = DataLoader(DatasetW, batch_size=BATCH_SIZE)
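One note on batching here: with a batch size above 1, the default collation will fail on sequences of different lengths, because it cannot stack tensors of unequal shape. A common workaround is a custom `collate_fn` that pads each batch to its longest sequence - a sketch, assuming each dataset item is a `(features, labels)` pair of tensors (the names `pad_collate`, `seqs`, `labs` are illustrative, not from my code):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    seqs, labs = zip(*batch)                          # unzip (features, labels) pairs
    lengths = torch.tensor([len(s) for s in seqs])    # true lengths, before padding
    seqs = pad_sequence(seqs, batch_first=True)       # [batch, max_len, 8], zero-padded
    labs = pad_sequence(labs, batch_first=True,
                        padding_value=-100)           # -100 is CrossEntropyLoss's ignore_index
    return seqs, labs, lengths

# usage: DataLoader(DatasetW, batch_size=BATCH_SIZE, collate_fn=pad_collate)
```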

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 8 input features -> hidden size 32, 2 stacked LSTM layers
        self.lstm1 = nn.LSTM(input_size=8, hidden_size=32, num_layers=2, dropout=0.5)
        self.fc1 = nn.Linear(32, 120)
        self.fc2 = nn.Linear(120, 9)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        x = x.unsqueeze(0)   # add a sequence dimension
        x = x.unsqueeze(0)   # add a batch dimension -> [1, 1, 8]
        x, _ = self.lstm1(x)
        x = F.dropout(x, p=0.9)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.softmax(x)
        return x

net = Net()
net = net.double()
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
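As I understand it, `nn.CrossEntropyLoss` applies log-softmax internally, so it expects raw, unnormalized logits of shape [N, num_classes] and integer class targets of shape [N] - applying `nn.Softmax` in the model before this loss squashes the gradients. A minimal sketch of the expected shapes (dummy values, 7 classes for labels 0-6):

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()

logits = torch.randn(5, 7)           # 5 days, 7 classes - raw scores, no softmax applied
targets = torch.randint(0, 7, (5,))  # one class index per day
loss = ce(logits, targets)           # scalar loss averaged over the 5 days
```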

This is where the issue most likely lies - I should not be passing in each data point one at a time.

for epoch in range(100):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(train_iterator, 0):
        inputs, labels = data
        for data_point, lab in zip(inputs[0], labels[0][0]):
            optimizer.zero_grad()
            outputs = net(data_point)
            lab = lab.unsqueeze(0)
            outputs = outputs.squeeze(0)

            loss = criterion(outputs, lab)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

    print("Epoch: ", epoch, "Loss: ", running_loss)
print('Finished Training')
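For comparison, one way to avoid the per-day inner loop is to feed each location's whole day sequence through the LSTM in a single forward pass: `nn.LSTM` with `batch_first=True` accepts [batch, seq_len, features] and emits an output per time step, which a linear layer can map to per-day class logits. A hedged sketch (the layer sizes mirror my model above, but `SeqTagger` and its 7-class head are my guess at the intended setup, not a drop-in replacement):

```python
import torch
import torch.nn as nn

class SeqTagger(nn.Module):
    # Sequence labelling: one logit vector per day, for all days at once.
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=32, num_layers=2,
                            dropout=0.5, batch_first=True)
        self.fc = nn.Linear(32, 7)   # 7 classes for labels 0-6

    def forward(self, x):            # x: [batch, seq_len, 8]
        out, _ = self.lstm(x)        # out: [batch, seq_len, 32]
        return self.fc(out)          # raw logits: [batch, seq_len, 7]

model = SeqTagger()
ce = nn.CrossEntropyLoss()
x = torch.randn(1, 900, 8)                        # one location, 900 days
y = torch.randint(0, 7, (1, 900))                 # one class id per day
logits = model(x)                                 # [1, 900, 7]
loss = ce(logits.view(-1, 7), y.view(-1))         # flatten days into the batch dim
```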

I am certain this is wrong, because I am only passing in one data point at a time, and I believe I should be using the LSTM differently. I am confused about how to create a dataset, and a corresponding model, that handles variable input lengths and takes advantage of batching if possible.

The current model works in the sense that it runs, but it does not learn - the loss is stuck - which, to be honest, is not surprising.