I wrote a BiLSTM Siamese network to measure the similarity of two strings, using either pairwise (Euclidean) distance or cosine similarity. The details are as follows:
import torch
import torch.nn as nn

# BiLSTM is my own encoder module, defined elsewhere (not shown here).
class SiameseNetwork(nn.Module):
    def __init__(self, num_layers, dropout, weight_matrix, vocabs, similarity_measure):
        super(SiameseNetwork, self).__init__()
        # Both inputs go through the same encoder (shared weights).
        self.lstm_network = BiLSTM(num_layers, weight_matrix, vocabs)
        self.fc_drop = nn.Dropout(p=dropout)
        self.similarity_measure = similarity_measure
        if self.similarity_measure == 'euclidean_distance':
            self.sm = nn.PairwiseDistance(p=2)
        else:
            self.sm = nn.functional.cosine_similarity

    def forward(self, input1, input2):
        output1 = self.lstm_network(input1)
        output2 = self.lstm_network(input2)
        out1 = self.fc_drop(output1)
        out2 = self.fc_drop(output2)
        x = self.sm(out1, out2)
        if self.similarity_measure == 'euclidean_distance':
            x = 1 - x  # The larger the x value is, the more similar the strings are.
        x = torch.sigmoid(x)
        return x
I used torch.sigmoid to squash the similarity score into the range 0 to 1. However, with the sigmoid, a pair of identical strings does not get a similarity of 1: in evaluation mode x is about 1 in both branches, so the output is sigmoid(1), roughly 0.73. How can I map both the pairwise distance and the cosine similarity into the range 0 to 1, so that the score is 0 when the strings are dissimilar and 1 when they are similar? Any help would be greatly appreciated. Thank you!
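
To make the question concrete, here is a minimal sketch on plain tensors instead of my BiLSTM outputs. It reproduces the sigmoid behaviour and tries two mappings that might give the range I want. The helper names euclidean_similarity and cosine_similarity_01 are hypothetical, and exp(-d) is only one possible monotone mapping of a distance (1 / (1 + d) would be another), so I am not sure either choice is the right one for training.

```python
import torch
import torch.nn.functional as F


def euclidean_similarity(a, b):
    # Hypothetical helper: exp(-d) maps a distance d in [0, inf) into (0, 1].
    # A distance near 0 gives a score near 1; large distances approach 0.
    d = F.pairwise_distance(a, b, p=2)
    return torch.exp(-d)


def cosine_similarity_01(a, b):
    # Hypothetical helper: cosine similarity lies in [-1, 1],
    # so shifting and scaling it gives a score in [0, 1].
    cos = F.cosine_similarity(a, b, dim=1)
    return (cos + 1) / 2


torch.manual_seed(0)
v = torch.randn(4, 8)  # stand-in for a batch of 4 encoded strings

# What my forward() currently does for identical inputs (Euclidean branch):
current = torch.sigmoid(1 - F.pairwise_distance(v, v, p=2))
print("sigmoid(1 - d):", current)

# Candidate mappings on identical and on opposite vectors:
print("exp(-d), identical:", euclidean_similarity(v, v))
print("(cos + 1) / 2, identical:", cosine_similarity_01(v, v))
print("(cos + 1) / 2, opposite:", cosine_similarity_01(v, -v))
```

Note that pairwise_distance adds a small eps to the difference, so the distance between identical vectors is tiny but not exactly zero, and the exp(-d) score is very close to 1 rather than exactly 1.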