Constraint Optimization

Hi!
I have two models: 1) an athlete model and 2) a referee model.
The athlete model has 17 inputs and 3 regression outputs, while the referee model has 20 inputs and 1 binary output. The first 17 inputs of the referee are similar to the inputs of the athlete model, while the next 3 inputs are different combinations, one of which is similar to the 3 outputs of the athlete model.
First, both models are trained separately; then the athlete model is fine-tuned, using the referee model for constraint optimization.
Both models are quite accurate on their own, but I am having issues with fine-tuning the athlete model: after fine-tuning, it becomes totally inaccurate. MAPE is around 80% after fine-tuning, compared with around 10% before.
Here is the fine-tuning code:

X_master is the athlete model input and y_master is the athlete model output.

Similarly, slave refers to the referee model.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from sklearn.model_selection import KFold
from torch.utils.data import DataLoader, TensorDataset

class CustomLossLayer1(nn.Module):
    def __init__(self, referee_model, scaler_mean, scaler_scale, weight_stability_loss=0.5):
        super().__init__()
        self.referee_model = referee_model
        # Keep only the scaler statistics for the last 3 referee inputs
        self.scaler_mean = torch.tensor(scaler_mean[17:], dtype=torch.float32)
        self.scaler_scale = torch.tensor(scaler_scale[17:], dtype=torch.float32)
        self.weight_stability_loss = weight_stability_loss

    def forward(self, additional_inputs, y_true, y_pred):
        # Clamp and scale y_pred, as y_pred is in 0 to 0.8 range
        clamped_y_pred = torch.clamp(y_pred, 0.0, 0.8)
        scaled_y_pred = (clamped_y_pred - self.scaler_mean) / self.scaler_scale

        # Combine inputs for the referee model
        combined_inputs = torch.cat([additional_inputs, scaled_y_pred], dim=1)

        # Get stability prediction from referee model
        stability_prediction = self.referee_model(combined_inputs)

        # Calculate the mean squared error loss
        mse_loss = F.mse_loss(y_pred, y_true)

        # Calculate dynamic stability target
        # (calculate_stability_target is defined elsewhere and not shown here)
        stability_target = calculate_stability_target(y_true, y_pred)

        # Calculate the stability loss based on the dynamic target
        stability_loss = F.binary_cross_entropy(stability_prediction, stability_target.unsqueeze(1))

        # Adjust stability loss: if stability_target is 1, set stability_loss to 0
        adjusted_stability_loss = stability_loss * (1 - stability_target.unsqueeze(1))

        # Combine the MSE and adjusted stability loss
        total_loss = mse_loss + self.weight_stability_loss * adjusted_stability_loss.mean()

        return total_loss

EPOCHS = 200
LEARNING_RATE = 0.01

# Initialize KFold
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# Lists to store performance metrics for each fold
mape_scores = []
regression_losses = []
training_losses = []
validation_losses = []

# Cross-validation loop
for fold, (train_index, test_index) in enumerate(kf.split(X_master)):
    print(f"Fold {fold+1}")

    # Split the data into training and test sets for this fold
    X_train, X_test = X_master_tensor[train_index], X_master_tensor[test_index]
    y_train, y_test = y_master_tensor[train_index], y_master_tensor[test_index]

    # Create DataLoaders for the athlete model
    train_dataset = TensorDataset(X_train, y_train)
    train_dataloader = DataLoader(train_dataset, batch_size=128, shuffle=True)

    test_dataset = TensorDataset(X_test, y_test)
    test_dataloader = DataLoader(test_dataset, batch_size=128, shuffle=False)

    # Initialize the athlete model and custom loss function
    athlete_model = AthleteModel()
    custom_loss = CustomLossLayer1(referee_model, scaler_slave_mean, scaler_slave_scale)

    # Define the optimizer
    athlete_optimizer = optim.Adam(athlete_model.parameters(), lr=LEARNING_RATE)

    # Train the athlete model with fine-tuning
    fold_training_losses = []
    fold_validation_losses = []
    for epoch in range(EPOCHS):
        athlete_model.train()
        batch_losses = []
        for inputs, targets in train_dataloader:
            athlete_optimizer.zero_grad()
            athlete_output = athlete_model(inputs)
            loss = custom_loss(inputs, targets, athlete_output)
            loss.backward()
            athlete_optimizer.step()
            batch_losses.append(loss.item())
        fold_training_losses.append(np.mean(batch_losses))

In the CustomLossLayer, I first combine the outputs of the athlete model with the inputs of the athlete model and then pass the result as input to the referee model, which gives its binary output (1 being stable and 0 being unstable). Then the stability loss is calculated, multiplied by a penalty factor, and added to the MSE loss to get the final loss.
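To make that description concrete, here is a minimal, self-contained sketch of one way such a combined loss could be written. It is not the code from my project: `RefereePenaltyLoss`, the stand-in `nn.Linear` athlete, the small stand-in referee network, and the mean/scale values are all hypothetical. It also makes three simplifying assumptions that differ from the code above: the stability target is fixed at 1 (stable) instead of coming from `calculate_stability_target`, the predictions are not clamped, and the referee is frozen so only the athlete receives updates.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RefereePenaltyLoss(nn.Module):
    """MSE plus a weighted penalty when the referee rates predictions unstable.

    Hypothetical sketch: the referee is assumed to output probabilities in (0, 1).
    """

    def __init__(self, referee_model, out_mean, out_scale, weight=0.5):
        super().__init__()
        self.referee_model = referee_model
        # Put the referee in eval mode and freeze its weights so that
        # only the athlete model is updated during fine-tuning.
        self.referee_model.eval()
        for p in self.referee_model.parameters():
            p.requires_grad_(False)
        # Buffers follow the module when .to(device) is called.
        self.register_buffer("out_mean", torch.as_tensor(out_mean, dtype=torch.float32))
        self.register_buffer("out_scale", torch.as_tensor(out_scale, dtype=torch.float32))
        self.weight = weight

    def forward(self, athlete_inputs, y_true, y_pred):
        mse_loss = F.mse_loss(y_pred, y_true)
        # Scale predictions the same way the referee's last 3 inputs are scaled.
        scaled_y_pred = (y_pred - self.out_mean) / self.out_scale
        referee_inputs = torch.cat([athlete_inputs, scaled_y_pred], dim=1)
        p_stable = self.referee_model(referee_inputs)  # shape (batch, 1)
        # Penalize a low stability probability: BCE against an all-ones target.
        stability_loss = F.binary_cross_entropy(p_stable, torch.ones_like(p_stable))
        return mse_loss + self.weight * stability_loss


# Toy usage with stand-in models (17 athlete inputs, 3 outputs, 20 referee inputs).
torch.manual_seed(0)
athlete = nn.Linear(17, 3)
referee = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
loss_fn = RefereePenaltyLoss(referee, out_mean=[0.4, 0.4, 0.4], out_scale=[0.2, 0.2, 0.2])

x = torch.randn(32, 17)        # made-up athlete inputs
y = torch.rand(32, 3) * 0.8    # made-up targets in the 0 to 0.8 range
loss = loss_fn(x, y, athlete(x))
loss.backward()                # gradients reach the athlete through the frozen referee
```

In this sketch the gradient of the stability term flows through the referee into `y_pred`, while the referee's own parameters stay untouched. Whether a fixed all-ones target is appropriate depends on what `calculate_stability_target` is meant to encode, which this sketch does not try to reproduce.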

Kindly check the above code, especially the CustomLossLayer1 class, and help me debug it. If you know of any other way to do this, please let me know.
I would be very thankful to you all.