I have a movement model from which I can simulate data. It has two parameters, and the output is latitude and longitude at N evenly spaced time points. My goal is to use a convolutional neural network to learn the relationship between the parameters P and the simulated data. This answer addresses the case in which the data are handled as a grid, where each row is a latitude, each column is a longitude, and the values are the number of occurrences at that lat/lon.
But now I no longer wish to ignore time. I want to set it up so that I have one long vector, where moving from one position to the next is one time increment, and there are two values at each position: lat and long. Is this where “channels” come in? In the previous case, the number of channels was 1; I’m thinking it should be 2 now.
The code below is for the case where I look at only lat or only long (1D). How can I edit it to handle both lat and long?
As you’ll see, my input training dimensions are [599, 1, 501, 1]: number of samples (simulated datasets), channels, rows (position at time x), and a trailing singleton dimension.
Thank you in advance!
>>> datalist_train.shape
torch.Size([599, 1, 501, 1])
>>> datalist_test.shape
torch.Size([399, 1, 501, 1])
>>> params_train.shape
torch.Size([599, 2])
>>> params_test.shape
torch.Size([399, 2])
>>> datalist_train.dtype
torch.float64
>>> datalist_test.dtype
torch.float64
>>> params_train.dtype
torch.float64
>>> params_test.dtype
torch.float64
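Given the shapes above, one way to read “channels” for this problem is to keep time as the length axis and store lat and long as two channels, i.e. a tensor of shape (samples, 2, timesteps). This is a minimal sketch, assuming the latitude and longitude tracks are available as separate tensors. `lat_tracks` and `lon_tracks` are hypothetical names, filled with random data here:

```python
import torch

n_samples, n_timesteps = 599, 501

# Hypothetical stand-ins for the simulated tracks: one row per simulated
# dataset, one column per time point. Replace with the real simulator output.
lat_tracks = torch.randn(n_samples, n_timesteps, dtype=torch.float64)
lon_tracks = torch.randn(n_samples, n_timesteps, dtype=torch.float64)

# Stack along a new dim 1 so that lat is channel 0 and long is channel 1.
# Conv1d expects (batch, channels, length), here (599, 2, 501).
tracks_2ch = torch.stack((lat_tracks, lon_tracks), dim=1)

print(tracks_2ch.shape)  # torch.Size([599, 2, 501])
```

With this layout, each kernel position sees lat and long at the same time steps, so the time ordering is preserved.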
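import numpy as np
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader
n_parameters = 2
#
# Define a simple 1D CNN that takes in (B, 1, n_timesteps) and outputs (B, 2 parameters)
#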
def create_model(n_output_parameters, hidden_size=16):
    model = nn.Sequential(
        #Normalize the input batch
        nn.BatchNorm1d(num_features=1),
        #Learn hidden_size kernels of length 3 that detect features
        nn.Conv1d(in_channels=1, out_channels=hidden_size, kernel_size=3, padding='same'),
        nn.ReLU(),
        nn.BatchNorm1d(hidden_size),
        #Halve the sequence length, keeping the max of each pair of neighbouring values
        nn.MaxPool1d(2),
        #Average over the length dimension to a single scalar per channel, and drop the redundant dim
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(start_dim=1),
        #Map the input shape (B, hidden_size) to (B, n_output_parameters)
        nn.Linear(in_features=hidden_size, out_features=hidden_size),
        nn.ReLU(),
        nn.Linear(in_features=hidden_size, out_features=n_output_parameters),
    )
    return model
cnn_model = create_model(n_parameters)
#Report CNN size
print(
    'CNN has',
    sum(p.numel() for p in cnn_model.parameters() if p.requires_grad),
    'trainable parameters'
)
#
# Prepare data for model
#  1. convert to tensors
#  2. arrange as (batch, channels, length)
#  3. wrap in DataLoader, to get batched samples
#
#Conv1d and BatchNorm1d expect the input format (batch/samples, channels, length).
# The tensors printed above are (B, 1, 501, 1), so flatten everything after the
# batch dimension into a single channel to get (B, C=1, L=501)
datalist_train, datalist_test = [tensor.reshape(tensor.shape[0], 1, -1) for tensor in (datalist_train, datalist_test)]
#Train and val datasets, returning (Ai, Pi) pair for each index.
# i.e. it pairs together the Ai and Pi of each sample.
train_dataset = TensorDataset(datalist_train, params_train)
val_dataset = TensorDataset(datalist_test, params_test)
train_i, params_i = train_dataset[0]  #returns the Ai and Pi for sample 0
print('Sample 0 (A0, P0) from train_dataset:', train_i.shape, ',', params_i.shape)
#Batchify
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
#
# Train and evaluate cnn_model
#
#For reproducibility
np.random.seed(0)
torch.manual_seed(0)
#The tensors are float64, so cast the model weights from the default float32 to match
cnn_model = create_model(n_parameters).double()
optimizer = torch.optim.Adam(cnn_model.parameters())
loss_function = nn.MSELoss()
#
# Training loop
#
n_epochs = 12
#Used for recording losses and metrics
from collections import defaultdict
metrics_dict = defaultdict(list)
for epoch in range(n_epochs):
    cnn_model.train()
    for minibatch in train_loader:
        A_minibatch, P_minibatch = minibatch
        P_predicted = cnn_model(A_minibatch)
        loss = loss_function(P_predicted, P_minibatch)
        #step optimizer
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    #/end of epoch
    #
    # Evaluate cnn_model per epoch
    #
    cnn_model.eval()
    with torch.no_grad():
        P_predicted_train = cnn_model(datalist_train)
        P_predicted_val = cnn_model(datalist_test)
        train_loss = loss_function(P_predicted_train, params_train).item()
        val_loss = loss_function(P_predicted_val, params_test).item()
        val_mae = nn.L1Loss()(P_predicted_val, params_test).item()
    metrics_dict['epoch'].append(epoch + 1)
    metrics_dict['train_loss'].append(train_loss)
    metrics_dict['val_loss'].append(val_loss)
    metrics_dict['val_mae'].append(val_mae)
    #
    # Report results
    #
    print(
        f'[epoch {epoch + 1:2d}/{n_epochs}]',
        f'[train loss: {train_loss:5.3f} | val loss: {val_loss:5.3f}]',
        f'[val MAE: {val_mae:5.3f}]'
    )
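For the two-channel case I am asking about, my understanding is that the only layers whose arguments depend on the channel count are the first BatchNorm1d and the first Conv1d. Below is a minimal sketch under that assumption. `create_model_2ch` and its `n_input_channels` argument are hypothetical names, not part of the code above, and the input batch is random:

```python
import torch
from torch import nn

def create_model_2ch(n_output_parameters, n_input_channels=2, hidden_size=16):
    # Same layer stack as the 1D model; only the input channel count changes.
    return nn.Sequential(
        # Normalize each input channel (lat and long) separately
        nn.BatchNorm1d(num_features=n_input_channels),
        # Each length-3 kernel now spans both channels at 3 consecutive time steps
        nn.Conv1d(n_input_channels, hidden_size, kernel_size=3, padding='same'),
        nn.ReLU(),
        nn.BatchNorm1d(hidden_size),
        nn.MaxPool1d(2),
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(start_dim=1),
        nn.Linear(hidden_size, hidden_size),
        nn.ReLU(),
        nn.Linear(hidden_size, n_output_parameters),
    )

# .double() matches the float64 tensors shown at the top of the question
model_2ch = create_model_2ch(n_output_parameters=2).double()

# Random stand-in batch shaped (batch, channels=2, timesteps=501)
dummy_batch = torch.randn(4, 2, 501, dtype=torch.float64)
predicted = model_2ch(dummy_batch)

print(predicted.shape)  # torch.Size([4, 2])
```

If this is right, the rest of the training loop should work unchanged, as long as the data tensors are shaped (samples, 2, 501).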