I’m trying to retrain a Siamese network with contrastive loss. I’ve pretrained the net for classification and then replaced the classification fc layer with a new fc layer of size 512.
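Roughly how I swapped the head (a sketch, not my exact code — I’m assuming a ResNet-style backbone here where the classifier lives in the `fc` attribute, and `num_my_classes` / the checkpoint path are placeholders):

```python
import torch
import torch.nn as nn
from torchvision import models

# hypothetical setup: a ResNet-18 pretrained on my classification task
net = models.resnet18(num_classes=num_my_classes)
net.load_state_dict(torch.load("classifier_checkpoint.pth"))

# replace the classification head with a 512-d embedding layer
net.fc = nn.Linear(net.fc.in_features, 512)
```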
However, the net doesn’t seem to learn at all. I suspect the margin in the contrastive loss is the cause. I’ve read that if I L2-normalize the output features, I can set a constant margin and forget about tuning it. But that doesn’t seem to work either. Here’s my training step:
```python
import torch
import torch.nn.functional as F

margin = 2

# 0: similar pair, 1: dissimilar pair
label_batch = (class_labels_1 != class_labels_2).to(device).float()

output1 = net(img_batch_1)
output2 = net(img_batch_2)

# L2-normalize the embeddings
o1 = F.normalize(output1, p=2, dim=1)
o2 = F.normalize(output2, p=2, dim=1)

euclidean_distance = F.pairwise_distance(output1, output2)

# contrastive loss: pull similar pairs together,
# push dissimilar pairs out to at least `margin`
loss_contrastive = torch.mean(
    (1 - label_batch) * torch.pow(euclidean_distance, 2)
    + label_batch * torch.pow(torch.clamp(margin - euclidean_distance, min=0.0), 2)
)

loss_contrastive.backward()
optimizer.step()
```
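For context, this is my understanding of why a constant margin is supposed to be safe after normalization: for unit vectors u and v, ||u − v||² = 2 − 2·(u · v), so the distance can never exceed 2. A quick sanity check I ran (just synthetic random vectors, not my data):

```python
import torch
import torch.nn.functional as F

# distances between L2-normalized vectors are bounded by 2
x = F.normalize(torch.randn(10000, 512), p=2, dim=1)
y = F.normalize(torch.randn(10000, 512), p=2, dim=1)
d = F.pairwise_distance(x, y)
print(d.min().item(), d.max().item())  # always within [0, 2]
```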
Does this look OK, or should I set the margin differently?
By “the net doesn’t seem to learn at all” I mean that the training loss drops very quickly (from 21 to 0.8), but the test loss doesn’t change. On the training set, the mean distance between dissimilar pairs drops from ~2.2 to ~1.2. Why does it drop? Shouldn’t it end up somewhere above 2.0 (above the margin)?
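In case it matters, this is roughly how I measure those mean distances (a sketch of my evaluation loop; `loader` and the batch unpacking are placeholders for my actual data pipeline):

```python
import torch
import torch.nn.functional as F

# collect distances separately for similar and dissimilar pairs
net.eval()
sim_d, dis_d = [], []
with torch.no_grad():
    for img1, img2, labels1, labels2 in loader:
        d = F.pairwise_distance(net(img1.to(device)), net(img2.to(device)))
        diff = (labels1 != labels2).to(device)
        sim_d.append(d[~diff])
        dis_d.append(d[diff])

print("mean similar distance:", torch.cat(sim_d).mean().item())
print("mean dissimilar distance:", torch.cat(dis_d).mean().item())
```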