Custom Loss not Converging

I am trying to train a network for writer-independent signature verification.
By writer-independent I mean that the model is trained on some signature datasets and produces a vector embedding at its last layer, so that signatures from different users land at different points in a high-dimensional space. During inference we can then compute the distance between the embedding of the incoming image and the stored embeddings.

For this we can train with triplet loss and, after training, create and store an embedding for each user.
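For reference, a minimal sketch of the standard triplet setup using PyTorch's built-in `nn.TripletMarginLoss`. The batch size, embedding dimension, and margin are placeholder assumptions, and random tensors stand in for the network's output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch of 8 embeddings with 128 dimensions each; in practice
# these would come from the last layer of the network.
anchor = torch.randn(8, 128, requires_grad=True)
positive = torch.randn(8, 128)
negative = torch.randn(8, 128)

# Built-in triplet loss: mean over the batch of max(d(a, p) - d(a, n) + margin, 0).
# With swap=True, d(a, n) is replaced by min(d(a, n), d(p, n)).
triplet = nn.TripletMarginLoss(margin=1.0, p=2.0, swap=True)
loss = triplet(anchor, positive, negative)

# The default reduction="mean" yields a scalar, so backward() can be called directly.
loss.backward()
```

Note that this loss only constrains the relative gap between the two distances, not their absolute values.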

But in my application we receive an image together with the user ID it claims to belong to, so we can fetch the stored reference image for that user ID and only have to decide whether the incoming image and the reference image match.
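The verification step could look roughly like the sketch below. The function name `is_same_writer`, the `reference_store` dictionary, the user ID, and the threshold value are all hypothetical; hand-written vectors stand in for real embeddings.

```python
import torch


def is_same_writer(query_emb, reference_emb, threshold):
    """Return True when the two embeddings are within `threshold` (L2 distance)."""
    distance = torch.norm(query_emb - reference_emb, p=2)
    return distance.item() <= threshold


# Hypothetical store mapping user ID -> reference embedding
reference_store = {"user_42": torch.tensor([0.0, 0.0, 1.0])}

# Embedding of the incoming image (toy values); its distance to the reference is 0.3
query = torch.tensor([0.0, 0.3, 1.0])
accepted = is_same_writer(query, reference_store["user_42"], threshold=0.5)
```

Because the decision is a fixed threshold on the absolute distance, the training loss needs to control absolute distances as well, which is the motivation for the custom loss.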

For that I created a custom loss function, but it is not converging.

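import torch
import torch.nn as nn


class Loss(nn.Module):
    def __init__(self, p_margin, n_margin, swap=True):
        super().__init__()
        self.p_margin = p_margin  # anchor-positive distance should be below this
        self.n_margin = n_margin  # anchor-negative distance should be above this
        self.swap = swap

    def forward(self, anchor, positive, negative):
        # Per-sample L2 distances between embeddings of shape (batch, dim)
        pos_dis = torch.norm(anchor - positive, 2, 1)
        neg_dis = torch.norm(anchor - negative, 2, 1)
        pn_dis = torch.norm(positive - negative, 2, 1)

        if self.swap:
            # Use the larger of the anchor-negative and positive-negative distances
            neg_dis = torch.maximum(pn_dis, neg_dis)

        # Penalize positives that are farther than p_margin from the anchor
        p_loss = pos_dis - self.p_margin
        p_loss = torch.maximum(p_loss, torch.zeros_like(p_loss))

        # Penalize negatives that are closer than n_margin
        n_loss = self.n_margin - neg_dis
        n_loss = torch.maximum(n_loss, torch.zeros_like(n_loss))
        # Returns one loss value per sample (no reduction)
        return p_loss + n_loss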

Here I take an anchor image, a positive image of the same class, and a negative image. The goal of the function is to push the L2 distance between samples of the same class (anchor and positive) below p_margin and the distance between different classes (anchor and negative) above n_margin.
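To make the intended behavior concrete, here is a small sanity check on hand-picked 2-D vectors. It is a sketch, not the class above: `double_margin_loss` is a hypothetical functional version without the swap step and with a mean reduction added so the result is a scalar.

```python
import torch


def double_margin_loss(anchor, positive, negative, p_margin, n_margin):
    """Hinge on both distances: d(a, p) <= p_margin and d(a, n) >= n_margin."""
    pos_dis = torch.norm(anchor - positive, p=2, dim=1)
    neg_dis = torch.norm(anchor - negative, p=2, dim=1)
    p_loss = torch.clamp(pos_dis - p_margin, min=0)
    n_loss = torch.clamp(n_margin - neg_dis, min=0)
    return (p_loss + n_loss).mean()  # mean reduction is an assumption


anchor = torch.tensor([[0.0, 0.0]])
far = torch.tensor([[3.0, 4.0]])   # distance 5 from the anchor
near = torch.tensor([[1.0, 0.0]])  # distance 1 from the anchor

# Both constraints violated: (5 - 2) + (4 - 1) = 6
violated = double_margin_loss(anchor, far, near, p_margin=2.0, n_margin=4.0)

# Both constraints satisfied: the loss is exactly 0 and so is its gradient
satisfied = double_margin_loss(anchor, near, far, p_margin=2.0, n_margin=4.0)

print(violated.item(), satisfied.item())  # 6.0 0.0
```

Any triplet that already satisfies both margins contributes nothing to the gradient, so the batch loss depends on how many triplets still violate the absolute margins.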

But my loss isn’t converging at all, even after 270 epochs.

Any leads on what may be wrong in the loss function? Maybe using hard values for p_margin and n_margin is not the right way.