Combining ResNet50 and an MLP in a Siamese structure

jorisvane · May 6, 2021, 6:18pm

Hi guys,
I started combining a pre-trained ResNet50 and an MLP in a Siamese structure and am running into a problem with dimensions. The idea is that two images, each with a price, go through the network, and it returns the probability of choosing the first image. My images are now of size [224,224,3], as they should be for ResNet, and I removed the last layer, which should give me a vector of 2048 features. I run the following code to test whether the model produces an output:

for epoch in range(num_epochs):
    for i, (image1, image2, y_label, price1, price2) in enumerate(train_loader):
        
        # Get data to cuda

        image1 = image1.to(device=device)
        image2 = image2.to(device=device)
        y_label = y_label.to(device=device)
        price1 = price1.to(device=device)
        price2 = price2.to(device=device)
        
        #print(image1.shape)
        
        # Forward

        prob = model(image1, price1, image2, price2)
        print(prob)

I get the following error: RuntimeError: mat1 dim 1 must match mat2 dim 0
This is my model:

import math

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)

newmodel = torch.nn.Sequential(*(list(model.children())[:-1]))

pretrained = newmodel

class NN(nn.Module):

    def __init__(self, my_pretrained_model):
        super(NN, self).__init__()
        self.pretrained = my_pretrained_model
        
        self.MLP = nn.Sequential(
            nn.Linear(2048, 124),
            nn.ReLU(),                 
            nn.Linear(124, 1)
        )
        self.last_node = nn.Linear(2,1)
    
    
    def forward_once(self, x, y):
        x = self.pretrained(x)
        x = self.MLP(x)
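        # Concatenate the image score with the price along the feature axis
        # (assumes y has shape [batch_size, 1])
        x = torch.cat((x, y), dim=1)
        x = self.last_node(x)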
        return x
    
    def forward(self, image1, price1, image2, price2):
        output1 = self.forward_once(image1, price1)
        output2 = self.forward_once(image2, price2)
        prob = 1 / (1 + torch.exp(output2 - output1))    # probability of choosing first image; torch.exp works on batched tensors
        return prob

Any help is appreciated!
Best regards

jorisvane · May 7, 2021, 1:53pm

The problem was that the truncated ResNet50 (everything up to, but not including, the final fully connected layer) returns a 4D tensor of shape [batch_size, 2048, 1, 1] rather than a 2D one. After squeezing it with torch.squeeze(x), it had the right shape, [batch_size, 2048], to go into the MLP.
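
For anyone landing here with the same error, below is a minimal sketch of how the fix could look in the model. It is not necessarily the exact code I ended up with, and it makes a few assumptions: the class name SiameseNet is made up for this example, prices are assumed to arrive as one number per sample, images are assumed to be channels-first ([batch_size, 3, 224, 224]), and the backbone in the usage part is a tiny stand-in that only mimics the [batch_size, 2048, 1, 1] output shape so the snippet runs without downloading weights. In the real setup you would pass the truncated ResNet50 from the post instead. The sketch uses torch.flatten(x, 1) rather than torch.squeeze(x), because squeeze without a dim argument also drops the batch dimension when the batch size is 1.

```python
import torch
import torch.nn as nn


class SiameseNet(nn.Module):  # hypothetical name, not from the original post
    def __init__(self, backbone):
        super().__init__()
        self.pretrained = backbone
        self.MLP = nn.Sequential(
            nn.Linear(2048, 124),
            nn.ReLU(),
            nn.Linear(124, 1),
        )
        self.last_node = nn.Linear(2, 1)

    def forward_once(self, image, price):
        x = self.pretrained(image)          # [B, 2048, 1, 1]
        x = torch.flatten(x, 1)             # [B, 2048], batch dim is kept
        x = self.MLP(x)                     # [B, 1]
        # Assumption: price holds one value per sample, shape [B] or [B, 1]
        price = price.reshape(-1, 1).float()
        x = torch.cat((x, price), dim=1)    # [B, 2]
        return self.last_node(x)            # [B, 1]

    def forward(self, image1, price1, image2, price2):
        out1 = self.forward_once(image1, price1)
        out2 = self.forward_once(image2, price2)
        # 1 / (1 + exp(out2 - out1)) is the same as sigmoid(out1 - out2)
        return torch.sigmoid(out1 - out2)


# Stand-in backbone (assumption): only reproduces the output shape of the
# truncated ResNet50. Replace with the Sequential built from resnet50.
backbone = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),                # [B, 3, 1, 1]
    nn.Conv2d(3, 2048, kernel_size=1),      # [B, 2048, 1, 1]
)

model = SiameseNet(backbone).eval()

images1 = torch.randn(4, 3, 224, 224)
images2 = torch.randn(4, 3, 224, 224)
prices1 = torch.rand(4) * 100
prices2 = torch.rand(4) * 100

with torch.no_grad():
    prob = model(images1, prices1, images2, prices2)

print(prob.shape)  # torch.Size([4, 1])
```

Using torch.sigmoid instead of math.exp keeps everything as tensors, so it works for whole batches and stays differentiable for training.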