Train two models in sequence simultaneously

I am trying to train an embedder. I have one model architecture that embeds texts, and a second model architecture that takes the output of the first model as its input and predicts the label.
At the end of training, I want to save the first model as a pre-trained embedding model.
The pseudocode is as follows:
input: text
… some BiLSTM, Dense layers …
output: n-dimensional embedding

input: embedding1, embedding2
find the similarity of texts
run a linear layer + sigmoid
output: similarity score
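
A minimal sketch of the two modules described above, written as PyTorch `nn.Module` classes. The class names (`Embedder`, `SimilarityHead`), all layer sizes, the mean-pooling step, and the use of cosine similarity are assumptions made for illustration; the pseudocode does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Embedder(nn.Module):
    """Model1: token ids -> n-dimensional embedding (all sizes are hypothetical)."""

    def __init__(self, vocab_size=1000, token_dim=32, hidden_dim=64, embed_dim=16):
        super().__init__()
        self.tokens = nn.Embedding(vocab_size, token_dim, padding_idx=0)
        self.bilstm = nn.LSTM(token_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.dense = nn.Linear(2 * hidden_dim, embed_dim)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer tensor
        out, _ = self.bilstm(self.tokens(token_ids))  # (batch, seq_len, 2 * hidden_dim)
        # Mean-pool over time steps (an assumption; padding is not masked here)
        return self.dense(out.mean(dim=1))  # (batch, embed_dim)


class SimilarityHead(nn.Module):
    """Model2: two embeddings -> similarity score between 0 and 1."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, emb1, emb2):
        # Cosine similarity is an assumed choice of similarity measure
        sim = F.cosine_similarity(emb1, emb2, dim=1).unsqueeze(1)  # (batch, 1)
        return torch.sigmoid(self.linear(sim)).squeeze(1)  # (batch,)


# The same embedder instance encodes both texts, so its weights are shared.
embedder = Embedder()
head = SimilarityHead()
text1 = torch.randint(1, 1000, (4, 12))  # toy batch of token ids
text2 = torch.randint(1, 1000, (4, 12))
scores = head(embedder(text1), embedder(text2))
```

Both texts pass through the same `Embedder` instance, so there is only one set of embedding weights to save afterwards.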

Now I want to use a dataset with the following structure: (text1, text2, similarity). I want to train both models together and save Model1 (the embedder) as a pretrained model to use later.

How can I train these two models together in PyTorch? How should I set up the optimizer?
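
One possible setup, as a minimal sketch rather than a definitive answer: pass the parameters of both models to a single optimizer, backpropagate one loss through both, and save only the first model's `state_dict`. The stand-in modules, sizes, learning rate, random toy batch, and the file name `model1_pretrained.pt` are all hypothetical. The sigmoid is folded into `BCEWithLogitsLoss` here, which is a substitution for the explicit sigmoid in the pseudocode.

```python
import itertools
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)


def make_model1():
    # Stand-in for the embedder (Model1); any nn.Module works the same way.
    return nn.Sequential(nn.EmbeddingBag(1000, 32), nn.Linear(32, 16))


class Head(nn.Module):
    """Stand-in for Model2: similarity of two embeddings -> one logit."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, e1, e2):
        sim = torch.cosine_similarity(e1, e2, dim=1).unsqueeze(1)
        return self.linear(sim).squeeze(1)  # logits; the loss applies the sigmoid


model1, model2 = make_model1(), Head()

# A single optimizer over the parameters of both models.
optimizer = torch.optim.Adam(
    itertools.chain(model1.parameters(), model2.parameters()), lr=1e-3
)
loss_fn = nn.BCEWithLogitsLoss()

# Toy batch shaped like (text1, text2, similarity); real data would come from a DataLoader.
text1 = torch.randint(0, 1000, (8, 12))
text2 = torch.randint(0, 1000, (8, 12))
similarity = torch.rand(8)

before = model1[1].weight.detach().clone()
for step in range(5):
    optimizer.zero_grad()
    logits = model2(model1(text1), model1(text2))
    loss = loss_fn(logits, similarity)
    loss.backward()  # gradients flow through model2 back into model1
    optimizer.step()

# Save only the embedder's weights for later reuse.
path = os.path.join(tempfile.gettempdir(), "model1_pretrained.pt")
torch.save(model1.state_dict(), path)

# Later: rebuild the same architecture and load the saved weights.
restored = make_model1()
restored.load_state_dict(torch.load(path))
```

Because the loss is computed from the output of the second model, which in turn depends on the output of the first, a single `loss.backward()` call produces gradients for both, and one optimizer step updates them together.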