Training models in parallel and interactively

I need to train two models in parallel. Each model has a different activation function with trainable parameters. I want to train both models so that the activation-function parameter of model one (e.g., alpha_1) stays separated from the corresponding parameter of model two (e.g., alpha_2) by a gap greater than 2, i.e., |alpha_1 - alpha_2| > 2. How can I include this constraint in the loss function used for training?
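
To make the constraint concrete, one possible formulation is a hinge-style penalty added to the sum of the two model losses. The sketch below is framework-agnostic plain Python and only illustrates the idea. The names `gap_penalty`, `gap_penalty_grad`, `joint_loss`, and `separate`, the weight `lam`, and the choice of a hinge penalty are all assumptions made for illustration, not part of any library. A penalty like this only encourages the gap softly. To get the strict inequality, a margin slightly larger than 2 could be used.

```python
def gap_penalty(alpha1, alpha2, margin=2.0):
    """Hinge penalty: 0 when |alpha1 - alpha2| >= margin, positive otherwise."""
    return max(0.0, margin - abs(alpha1 - alpha2))


def gap_penalty_grad(alpha1, alpha2, margin=2.0):
    """Subgradient of gap_penalty with respect to (alpha1, alpha2)."""
    diff = alpha1 - alpha2
    if abs(diff) >= margin:
        return 0.0, 0.0  # constraint satisfied, nothing to push
    sign = 1.0 if diff >= 0 else -1.0
    return -sign, sign


def joint_loss(loss1, loss2, alpha1, alpha2, lam=1.0, margin=2.0):
    """Sum of both model losses plus the weighted gap penalty (lam is hypothetical)."""
    return loss1 + loss2 + lam * gap_penalty(alpha1, alpha2, margin)


def separate(alpha1, alpha2, lr=0.1, steps=100, margin=2.0):
    """Toy gradient descent on the penalty alone, to show it pushes the values apart."""
    for _ in range(steps):
        g1, g2 = gap_penalty_grad(alpha1, alpha2, margin)
        alpha1 -= lr * g1
        alpha2 -= lr * g2
    return alpha1, alpha2
```

In an autograd framework, the same penalty expression could be added to the combined loss so that both alpha parameters receive gradients from it during a joint training step. How large `lam` should be relative to the task losses would need tuning.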