I need to find the out-group image in a set of three images (say A, B, and C). The anchor image is not known to me, so I treat this as a classification problem in which each class indicates which of the three images is the out-group one. Hence there are three classes, and my objective is to predict the class from the input set of three images. I am providing an image to give a better idea of the kind of images I am considering.
I am using an atrous network from DeepLab_v3 as the Siamese network. The features derived from each image are then concatenated and fed to linear layers (the classifier), which finally produce three outputs. The network is as follows:
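The labelling scheme described above can be sketched as follows. This is only an illustration in plain Python: the helper `make_sample` and the string placeholders standing in for images are hypothetical and not part of my actual data pipeline.

```python
import random

def make_sample(in_group_a, in_group_b, out_group, rng=random):
    """Place the out-group item at a random slot; the slot index is the label."""
    label = rng.randrange(3)            # 0, 1 or 2: position of the out-group item
    images = [in_group_a, in_group_b]   # the two items that belong together
    images.insert(label, out_group)     # the out-group item lands at index `label`
    return images, label

# Placeholder strings stand in for image tensors.
images, label = make_sample("cat_1", "cat_2", "dog_1", random.Random(0))
print(images, label)
```

With this encoding, `images[label]` is always the out-group image, so the three class indices correspond to the three input positions.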
    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision import models


    class EmbeddingNet(nn.Module):
        def __init__(self):
            super(EmbeddingNet, self).__init__()
            deeplab = models.segmentation.deeplabv3_resnet50(pretrained=False)
            print('deeplab:', deeplab)
            features_classifier = list(deeplab.backbone.children())[:4]
            features_classifier.extend(nn.Sequential(
                nn.Conv2d(in_channels=64, out_channels=2048, kernel_size=1, stride=1, bias=False),
                nn.BatchNorm2d(2048),
                nn.ReLU(inplace=True)))
            features_classifier.extend(list(deeplab.classifier.children())[:-1])
            self.network = nn.Sequential(*features_classifier)

        def forward(self, x):
            out = self.network(x)
            return out


    class TripletNet(nn.Module):
        def __init__(self, embedding_net):
            super(TripletNet, self).__init__()
            self.embedding_net = embedding_net
            self.fc1 = nn.Linear(738048, 256, bias=True)
            self.bn1 = nn.BatchNorm1d(256)
            self.relu1 = nn.ReLU(inplace=True)
            self.fc2 = nn.Linear(256, 3, bias=True)
            self.bn2 = nn.BatchNorm1d(3)
            self.relu2 = nn.ReLU(inplace=True)

        def forward(self, x1, x2, x3):
            output1 = self.embedding_net(x1)
            output1 = output1.view(output1.size(0), -1)
            output2 = self.embedding_net(x2)
            output2 = output2.view(output2.size(0), -1)
            output3 = self.embedding_net(x3)
            output3 = output3.view(output3.size(0), -1)
            output = torch.cat((output1, output2, output3), 1)
            output = self.fc1(output)
            output = self.bn1(output)
            output = self.relu1(output)
            output = self.fc2(output)
            output = self.bn2(output)
            output = self.relu2(output)
            return F.log_softmax(output, dim=1)

        def get_embedding(self, x):
            return self.embedding_net(x)
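For readers wondering where the `738048` in `fc1` comes from, here is a rough shape calculation in plain Python. It rests on assumptions that are not stated above: a 124x124 input, the standard ResNet stem (7x7 convolution with stride 2 and padding 3, then 3x3 max-pool with stride 2 and padding 1), and a DeepLab head that keeps the spatial size and outputs 256 channels. The helper names are illustrative only.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a conv/pool layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_features(height, width, channels=256, n_images=3):
    """Size of the concatenated, flattened embedding fed to fc1."""
    dims = []
    for size in (height, width):
        size = conv_out(size, kernel=7, stride=2, padding=3)  # ResNet conv1 (assumed)
        size = conv_out(size, kernel=3, stride=2, padding=1)  # ResNet maxpool (assumed)
        dims.append(size)
    return n_images * channels * dims[0] * dims[1]

print(flattened_features(124, 124))  # 738048, matching nn.Linear(738048, 256)
```

Under these assumptions each image yields a 256 x 31 x 31 feature map, and three of them concatenated give 3 * 256 * 31 * 31 = 738048 inputs to the first linear layer.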
Finally, I use Cross Entropy as the loss function.
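To make the loss concrete, here is a minimal hand-written version of cross-entropy for a single three-class prediction. It is a sketch in plain Python rather than the PyTorch call used in training, and the helper names `log_softmax` and `cross_entropy` are illustrative. It shows the reference value for this setup: when all three classes receive equal scores, the loss is ln 3, about 1.0986, which is what chance-level prediction looks like.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class."""
    return -log_softmax(logits)[target]

# Equal scores for all three classes: the model has no preference.
print(cross_entropy([0.0, 0.0, 0.0], target=1))  # ln(3), about 1.0986
# A confident, correct prediction gives a loss close to zero.
print(cross_entropy([0.0, 8.0, 0.0], target=1))
```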
Problem: The loss value and the accuracy barely change during training.
Could anyone suggest what I should do?