I am trying to make a simple regressor work as a function optimiser, for didactic purposes.
I found some posts on the topic (e.g. here), but I eventually got stuck.
I attach an MWE below.
I generate a dummy dataset and train a simple regressor.
Then I try to freeze the weights and optimise over the inputs, after setting requires_grad = True on them. I simply define the loss as the model prediction and propagate backwards. I am not sure, though, how to save the inputs so that I can re-use them as a starting point for the next iteration.
I am also unsure how (or whether) to add the inputs to the optimiser with optimizer.add_param_group, as I get an error (please see below).
Here is the simplest example: the function is a simple quadratic, so the minimum is the null tensor.
import copy
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import tqdm
from sklearn.model_selection import train_test_split
### Create dummy dataset
X = np.random.rand(1000,5)
y = np.apply_along_axis(lambda x: x[0]**2 + x[1]**2 , axis = 1, arr = X)[:, None]
X, y = torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)
# train-test split for model evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.85, shuffle=True)
model = nn.Sequential(
    nn.Linear(5, 8),
    nn.ReLU(),
    nn.Linear(8, 12),
    nn.ReLU(),
    nn.Linear(12, 6),
    nn.ReLU(),
    nn.Linear(6, 1)
)
# loss function and optimizer
loss_fn = nn.MSELoss() # mean square error
optimizer = optim.Adam(model.parameters(), lr=0.0001)
n_epochs = 50 # number of epochs to run
batch_size = 1 # size of each batch
batch_start = torch.arange(0, len(X_train), batch_size)
# Hold the best model
best_mse = np.inf # init to infinity
best_weights = None
history = []
###
### TRAINING
###
for epoch in range(n_epochs):
    model.train()
    with tqdm.tqdm(batch_start, unit="batch", mininterval=0, disable=True) as bar:
        bar.set_description(f"Epoch {epoch}")
        for start in bar:
            # take a batch
            X_batch = X_train[start:start+batch_size]
            y_batch = y_train[start:start+batch_size]
            # forward pass
            y_pred = model(X_batch)
            loss = loss_fn(y_pred, y_batch)
            # backward pass
            optimizer.zero_grad()
            loss.backward()
            # update weights
            optimizer.step()
            # print progress
            bar.set_postfix(mse=float(loss))
#### OPTIMISE OVER INPUT
# Freeze weights
for param in model.parameters():
    param.requires_grad = False
# Input to optimise, guess value for first iteration
X_0 = X_train[0]
# Set flag requires_grad to True
X_0.requires_grad = True
### Input optimisations
INPUT_OPTIMISATION_ITER = 200
for epoch in range(INPUT_OPTIMISATION_ITER):
    model.train()
    with tqdm.tqdm(batch_start, unit="batch", mininterval=0, disable=True) as bar:
        bar.set_description(f"Epoch {epoch}")
        for start in bar:
            # take a batch
            # forward pass
            y_pred = model(X_0)
            ### loss = loss_fn(y_pred, y_batch)
            loss = y_pred
            # backward pass
            optimizer.zero_grad()
            loss.backward(retain_graph=True)
            # update weights
            optimizer.step()
            # print progress
            bar.set_postfix(mse=float(loss))
Assuming the above is correct, where do I find the actual optimised inputs at the end of the loop, so that I can re-use them as the starting point for the next iteration?
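My current guess, which I have not been able to verify, is that X_0 itself would hold the updated values (once it is actually being updated), and that I would carry them over roughly like this (the detach/clone step is my own assumption):

optimised_input = X_0.detach().clone()               # current values, cut off from the graph
X_0 = optimised_input.clone().requires_grad_(True)   # fresh leaf tensor for the next round

Is that the right way to read the result back?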
I am also unsure how to add the inputs to the optimiser, with optimizer.add_param_group. Is this needed at all?
If I do
optimizer.add_param_group({"params": X_0})
I get the error
ValueError: some parameters appear in more than one parameter group
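For what it is worth, I suspect the cleaner route is not add_param_group on the training optimiser at all, but a separate optimiser that owns only the input tensor. A minimal sketch of what I have in mind (the clone/detach step and the lr=0.01 value are my own untested assumptions):

# separate optimiser for the input only; the model weights stay frozen
X_0 = X_train[0].clone().detach().requires_grad_(True)   # leaf tensor, detached from X_train
input_optimizer = optim.Adam([X_0], lr=0.01)

model.eval()
for _ in range(INPUT_OPTIMISATION_ITER):
    input_optimizer.zero_grad()
    loss = model(X_0)          # the prediction itself is the quantity to minimise
    loss.backward()
    input_optimizer.step()

print(X_0.detach())   # hopefully the first two components move towards zero

Is a separate optimiser like this the intended approach, or should add_param_group on the existing optimizer work here as well?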
Thanks