Is my nn.Sequential starting warm during each grid search loop?

I recently migrated my project from scikit-learn to PyTorch, so my implementation might be way off.

I am successively training a neural network with different hyperparameter values to perform a grid search. During this process I am seeing the R^2 score gradually increase with each combination of hyperparameter values, which is not expected, so I am wondering whether the network's weights are actually being cleared between iterations. Below is my implementation:

import torch
import torch.nn as nn

# The model is defined once, before the grid search loop
regr = nn.Sequential(
    nn.Linear(6, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
).to(device)

for hyperparameter in hyperparameters:
    optimizer = torch.optim.Adam(regr.parameters(), lr=0.001)
    for epoch in range(n_epochs):
        # custom_loss() performs the forward pass and computes the loss
        loss = custom_loss()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

I would have expected each iteration of the hyperparameter loop to create a fresh optimizer instance, but that may not be the case. What is a better approach?

I have left out some details. I am hoping this is enough but I am happy to provide more code if not.

You are creating a new optimizer, but you are not creating a new model or resetting its parameters.
loss = custom_loss() also seems to perform the forward pass and the loss computation, so I'm unsure where the model is created.
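
To illustrate the point (a minimal standalone sketch, separate from your code): constructing a new optimizer only wraps the existing parameter tensors and never reinitializes them:

import torch
import torch.nn as nn

layer = nn.Linear(2, 1)
w_before = layer.weight.clone()

# A fresh optimizer wraps the *same* parameter tensors;
# it does not reset or reinitialize them.
optimizer = torch.optim.Adam(layer.parameters(), lr=0.001)

print(torch.equal(w_before, layer.weight))  # True: weights unchanged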

That’s true. Here is my CustomLoss class:

import torch.nn as nn

class CustomLoss(nn.Module):
    def __init__(self, weight_1=1.0, weight_2=1.0):
        super().__init__()
        # Weights to control the influence of each term
        self.weight_1 = weight_1
        self.weight_2 = weight_2
        # Define the loss functions
        self.mse_loss = nn.MSELoss()

    def forward(self, predictions, true_values, theoretical_values):
        # Loss Term 1: Error between predictions and true values (ground truth)
        loss_term_1 = self.mse_loss(predictions, true_values)
        
        # Loss Term 2: Error between predictions and theoretical values
        loss_term_2 = self.mse_loss(predictions, theoretical_values)
        
        # Combine the two loss terms
        total_loss = self.weight_1 * loss_term_1 + self.weight_2 * loss_term_2
        return total_loss
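
Roughly, the forward pass and loss computation happen together like this (a simplified sketch; x, y_true, and y_theory stand in for my actual tensors):

criterion = CustomLoss(weight_1=1.0, weight_2=1.0)
predictions = regr(x)  # forward pass through the model
loss = criterion(predictions, y_true, y_theory)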

How should I go about creating a new model or resetting the parameters? Thank you!

You could call the reset_parameters() method on all registered modules. Something like this should work:

for module in model.modules():
    # Most built-in layers (e.g. nn.Linear) define reset_parameters()
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()
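
Note that reset_parameters() only reinitializes the weights; Adam additionally keeps per-parameter running statistics, so you should still construct a fresh optimizer each iteration, as your loop already does. A minimal sketch using the names from your code:

for hyperparameter in hyperparameters:
    # Reinitialize every submodule that defines reset_parameters()
    for module in regr.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()
    # Recreate the optimizer so Adam's moment estimates also start fresh
    optimizer = torch.optim.Adam(regr.parameters(), lr=0.001)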

or you could simply recreate the model and optimizer:

regr = nn.Sequential(
    nn.Linear(6, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 20),
    nn.ReLU(),
    nn.Linear(20, 1)
).to(device)
optimizer = torch.optim.Adam(regr.parameters(), lr=0.001)

inside the hyperparameters loop, as in the sketch below.
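
Putting it together (a sketch; make_model is a hypothetical helper, not part of the original code, wrapping the architecture from the question):

def make_model():
    # Each call returns a freshly initialized copy of the network
    return nn.Sequential(
        nn.Linear(6, 20),
        nn.ReLU(),
        nn.Linear(20, 20),
        nn.ReLU(),
        nn.Linear(20, 20),
        nn.ReLU(),
        nn.Linear(20, 20),
        nn.ReLU(),
        nn.Linear(20, 1),
    ).to(device)

for hyperparameter in hyperparameters:
    regr = make_model()  # fresh weights every iteration
    optimizer = torch.optim.Adam(regr.parameters(), lr=0.001)

Wrapping model creation in a function keeps the grid search loop honest: nothing trained in one iteration can leak into the next.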


The model is created in the first snippet, before the loop.

It seems like my main issue was creating my model outside of the hyperparameter loop, so the trained weights carried over between iterations.

Yes, this should fix my issues. Thank you, @ptrblck!