(Face recognition) Problem fine-tuning ArcFace with triplet loss


I am trying to fine-tune the last layer of an ArcFace face-recognition model. To do this, I freeze all the weights of the network except those in the last layer. The problem is that the triplet loss does not decrease. If I do not freeze any weights, the model learns well; however, I want to reuse the knowledge the network acquired during its ArcFace training, so I want to keep the earlier weights frozen.
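For reference, here is a minimal sketch of the setup I mean, in PyTorch. The backbone here is a hypothetical stand-in for the pretrained ArcFace network (the layer sizes and names are placeholders, not the real architecture); everything except the last layer is frozen, and only that layer is trained with triplet loss:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the pretrained ArcFace backbone --
# substitute your actual model here.
backbone = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 512),  # last layer, the only one to fine-tune
)

# Freeze everything, then unfreeze only the last layer.
for p in backbone.parameters():
    p.requires_grad = False
for p in backbone[-1].parameters():
    p.requires_grad = True

optimizer = torch.optim.Adam(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-4
)
triplet = nn.TripletMarginLoss(margin=0.2)

def embed(x):
    # L2-normalise so distances are computed on the unit hypersphere,
    # matching how ArcFace embeddings are usually compared.
    return nn.functional.normalize(backbone(x), dim=1)

# One toy training step on random anchor/positive/negative batches.
anchor, pos, neg = (torch.randn(8, 128) for _ in range(3))
loss = triplet(embed(anchor), embed(pos), embed(neg))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With this setup the gradient only flows into the last layer; the frozen parameters never receive updates.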
I would like to know if someone understands what is happening. My impression is that there is an incompatibility between the embeddings, since ArcFace is trained with a classification objective rather than a contrastive one like triplet loss. But that does not quite make sense to me either: in both cases, embeddings of the same person should end up close together in the embedding space, so there should not be a problem.
*I have the same problem with any model trained with a classification loss (AdaFace, SphereFace, etc.), but not with contrastive models like FaceNet.