I was testing the cosine similarity of embeddings from an untrained ResNet (and other models) when I noticed that I always get values of 1.0 for any pair of pictures.
I have some example code which demonstrates this:
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

img0 = Image.open("train/006_000.png")
img1 = Image.open("train/006_001.png")

tf_temp = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img0 = tf_temp(img0).unsqueeze(0)
img1 = tf_temp(img1).unsqueeze(0)
batch = torch.cat([img0, img1], 0)

embedding_temp = net.embedding(batch)    # one batched forward pass
embedding_temp_1 = net.embedding(img0)   # two separate forward passes
embedding_temp_2 = net.embedding(img1)

# Compare the two embeddings from the batched pass...
print(F.cosine_similarity(embedding_temp[0], embedding_temp[1], dim=0).item())
# ...and the two embeddings from the separate passes.
print(F.cosine_similarity(embedding_temp_1, embedding_temp_2, dim=1).item())
Note: net.embedding is the ResNet.
I expected both results to be the same, since I only changed how many tensors are processed at a time: the first comparison uses a single batched forward pass over both images, while the second runs img0 and img1 through the network separately. With the separate variant I always get values near 1.0. Why?
Edit: When I set net.embedding.eval(), both numbers are the same. However, I still get values near 1.0 all the time, and I am still unsure why that is.
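For reference, here is a minimal self-contained sketch of what I observe, with a tiny conv + BatchNorm + pooling stack standing in for the untrained ResNet (the stand-in model and random inputs are my assumptions, not the actual net.embedding):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Tiny stand-in for net.embedding (assumption): the same kinds of layers
# a ResNet is built from, randomly initialized (i.e. untrained).
embedding = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

x0 = torch.randn(1, 3, 32, 32)
x1 = torch.randn(1, 3, 32, 32)

def sims(model):
    """Cosine similarity of the two embeddings, batched vs. separate."""
    with torch.no_grad():
        batch = model(torch.cat([x0, x1], 0))  # one forward pass
        e0, e1 = model(x0), model(x1)          # two forward passes
    return (F.cosine_similarity(batch[0], batch[1], dim=0).item(),
            F.cosine_similarity(e0, e1, dim=1).item())

embedding.train()
print("train mode:", sims(embedding))  # the two numbers disagree

embedding.eval()
print("eval mode:", sims(embedding))   # the two numbers now agree
```

In train mode BatchNorm normalizes with the statistics of the current batch, so single-image and two-image forward passes see different statistics; in eval mode it uses the stored running statistics, so both paths agree, which matches what I see with net.embedding.eval().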