Implementation of a very simple siamese network

Hello. I’m trying to implement a siamese network with a contrastive loss.
It’s trained on raw tabular data. I’ve started with just a few columns, and the task is to predict whether two rows are equal (1) or not (0).
The model is:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, num_features, num_targets):  # num_targets is currently unused
        super(Model, self).__init__()
        self.hidden_size = [5, 5, 5]
        self.dropout_value = [0.5, 0.35, 0.25]

        self.head = nn.Sequential(
            nn.BatchNorm1d(num_features),
            nn.Dropout(self.dropout_value[0]),
            nn.Linear(num_features, self.hidden_size[0]),
            nn.LeakyReLU(),

            nn.BatchNorm1d(self.hidden_size[0]),
            nn.Dropout(self.dropout_value[1]),
            nn.Linear(self.hidden_size[0], self.hidden_size[1]),
            nn.LeakyReLU(),

            nn.BatchNorm1d(self.hidden_size[1]),
            nn.Dropout(self.dropout_value[2]),
            nn.utils.weight_norm(nn.Linear(self.hidden_size[1], self.hidden_size[2]))
        )

    def forward(self, x1, x2):
        x1 = self.head(x1)
        x2 = self.head(x2)
        return x1, x2
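Both inputs go through the same `self.head`, so the two branches share weights, which is the siamese property. A quick standalone check of that forward pass (the `nn.Linear(3, 2)` here is just a stand-in for the real head):

```python
import torch
import torch.nn as nn

# Stand-in for self.head, only to illustrate the shared-weights forward pass
head = nn.Linear(3, 2)

x1 = torch.randn(4, 3)  # batch of 4 "left" rows
x2 = torch.randn(4, 3)  # batch of 4 "right" rows

e1, e2 = head(x1), head(x2)  # the exact same parameters are applied to both inputs
print(e1.shape, e2.shape)
```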

And the loss is:

import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    def __init__(self, margin=1.):
        super(ContrastiveLoss, self).__init__()
        self.margin = margin
        self.eps = 1e-9

    def forward(self, output1, output2, target, size_average=True):
        distances = (output2 - output1).pow(2).sum(1)
        losses = 0.5 * (target.float() * distances +
                        (1 - target).float() * F.relu(self.margin - (distances + self.eps).sqrt()).pow(2))
        return losses.mean() if size_average else losses.sum()
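To sanity-check the loss by hand, I evaluate the same formula on two toy pairs (the embedding values are made up): a similar pair that is close in embedding space, and a dissimilar pair that is already past the margin:

```python
import torch
import torch.nn.functional as F

margin, eps = 1.0, 1e-9

out1 = torch.tensor([[0.0, 0.0], [0.0, 0.0]])
out2 = torch.tensor([[0.1, 0.1], [2.0, 0.0]])
target = torch.tensor([1, 0])  # 1 = equal pair, 0 = unequal pair

distances = (out2 - out1).pow(2).sum(1)  # squared Euclidean distances
losses = 0.5 * (target.float() * distances +
                (1 - target).float() * F.relu(margin - (distances + eps).sqrt()).pow(2))

# The similar pair is penalized by its (small) squared distance;
# the dissimilar pair is beyond the margin, so its hinge term is zero.
print(losses)  # tensor([0.0100, 0.0000])
```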

The problem is that training is stuck at around 0.13x training loss and 0.2xx validation loss, and inspecting `distances` gives some weird results.
For the first 5 pairs, instead of something matching the labels:

[1. 0. 0. 1. 1.]

It gives:

[0. , 0. , 0.38202894, 0. , 0.]
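To compare against the labels, I turn the distances into 0/1 predictions with a distance threshold (the 0.5 cut-off here is an arbitrary choice for illustration, not something from my training code):

```python
import torch

# Distances for the first 5 pairs from my run
distances = torch.tensor([0., 0., 0.38202894, 0., 0.])
labels = torch.tensor([1, 0, 0, 1, 1])

threshold = 0.5  # assumed cut-off: small distance -> predict "equal" (1)
preds = (distances < threshold).long()
acc = (preds == labels).float().mean().item()

# Nearly every pair collapses to zero distance, so everything is predicted "equal"
print(preds.tolist(), acc)
```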

What could be the problem? Is this architecture appropriate for the task, or could it be some technical issue in my code?