Torch.view() dimensions constraint

alex_gilabert · July 5, 2020, 10:56pm

Hello

I am doing some research on CNNs and I want to extract features manually through a function.

The problem I am facing is that I want to change the dimensions of the output tensor for an image batch so that I can treat it in a more linear way.

I am using VGG16 and AlexNet. In VGG16, at Conv5_3, we get a tensor with dimensions (10, 512, 14, 14). I want to keep the first dimension for the number of images in the minibatch and get a result like (10, 100352). I have no issue with this.

If I want to go to Conv4_3, for example, the tensor I want to convert is (10, 512, 28, 28), which would make it a matrix of (10, 4014080), but I get an error message like this: "shape '[10, 4014080]' is invalid for input of size 4014080".

I am pretty sure this is because it goes over some limit on dimensions, but I cannot confirm this and have no clue what is really going on.

If this is the problem, does anyone know a way I can change the tensor dimensions in the code below?

import torch
import torch.nn as nn
from torchvision import models

net = models.alexnet(pretrained=True)
num_features = net.classifier[6].in_features      # 4096
features = list(net.classifier.children())[:-2]   # Drop the last two classifier layers
print(features)
net.classifier = nn.Sequential(*features)         # Replace the model classifier
modified_pretrained = nn.Sequential(*list(net.features.children())[:-2]) # to relu5_3 (:-2)-------relu4_3 (:-9)

def feature_extr_cdist_train (model, dataloader):
    features_list = torch.empty( (0, 432640), dtype=torch.float ).cuda()  # Must be changed depending on the number of features: #1000 #4096 #100352 #4014080   AlexNet 432640
    image_album = torch.empty( [1, 3, 224, 224], dtype=torch.float ).cuda()
    counter = 0  
    model.cuda()
    for i, data in enumerate(dataloader, 0):
        input, label = data                                             
        input, label = input.to(device), label.to(device)               
        n,c ,h,w = input.size()
        outputs = model(input)
        outputs = outputs.view(10, 432640)   #8028160
        

        if (i == 0):                                                    
            features_list = torch.cat( (features_list, outputs[0].view(1,432640)), 0) #view(1,-1)
            image_album[0] = input[0]
        
        dist_tensores = torch.cdist(outputs, features_list, p=2.0) 
        activation = torch.gt(dist_tensores, AVG, out=torch.cuda.FloatTensor(len(outputs), len(features_list)))
        counter = len(features_list)
        idx = activation.sum(1) == counter
        features_list = torch.cat((features_list, outputs[idx]), 0)
        image_album = torch.cat((image_album, input[idx,:,:,:]), 0)

    return features_list, image_album

My issue is in the part where I associate the input with the output of the CNN, and in the conversion with view().
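The width of `features_list` is hard-coded in the function above (432640) and has to be edited by hand for each network and layer. One possible alternative is to measure it with a single dummy forward pass. The following is only a minimal sketch: `feature_width` is a hypothetical helper, and `toy` is an assumed small stand-in for the truncated network, not the pretrained AlexNet from the post.

```python
import torch
import torch.nn as nn

def feature_width(model, input_shape=(3, 224, 224)):
    """Number of features per image once the model output is flattened.

    Hypothetical helper: runs one all-zeros image through the model.
    Note that it switches the model to eval mode.
    """
    model.eval()
    device = next(model.parameters()).device  # run on the model's device
    with torch.no_grad():
        dummy = torch.zeros(1, *input_shape, device=device)
        out = model(dummy)
    # Keep the batch axis, merge everything else into one axis.
    return out.flatten(1).size(1)

# Assumed stand-in for a truncated feature extractor.
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
)

width = feature_width(toy)
# Empty buffer with the measured width, ready for torch.cat along dim 0.
features_list = torch.empty((0, width), dtype=torch.float)
print(width, features_list.shape)
```

With a buffer sized this way, the same function could be reused for a different layer without editing the constant.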

Thanks in advance

Nikronic · July 6, 2020, 7:03am

Hi,

Could you add the stack trace of the error? The dimensions you mentioned in your explanation do not appear in your code, so I cannot find where it happens.

But I would like to mention that

512*28*28 = 401408 and 10*512*28*28 = 4014080 (note the zero at the end). So the error says that you cannot reshape 4014080 elements into [10, 4014080], which would need 40140800 elements.
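The arithmetic above can be checked without PyTorch. This is a minimal sketch in plain Python; `flattened_shape` and `view_is_valid` are hypothetical helper names used only for illustration, and they mirror the rule that a view must keep the total number of elements unchanged.

```python
from math import prod

def flattened_shape(shape):
    """Return (batch, features) for a shape whose first axis is the batch."""
    batch, *rest = shape
    return (batch, prod(rest))

def view_is_valid(shape, target):
    """A view is only possible when both shapes hold the same element count."""
    return prod(shape) == prod(target)

conv4_3 = (10, 512, 28, 28)
print(flattened_shape(conv4_3))               # (10, 401408)
print(view_is_valid(conv4_3, (10, 4014080)))  # False
print(view_is_valid(conv4_3, (10, 401408)))   # True
```

So the failure comes from a mismatch in element counts, not from any upper limit on tensor dimensions.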

Bests

alex_gilabert · July 6, 2020, 10:49am

Yes, sure! You were right, I made a mistake when writing up the issue. I hope this helps you more, @Nikronic!

Output before redimension torch.Size([10, 256, 13, 13])
RuntimeError Traceback (most recent call last)
in ()
2 for param in modified_pretrained.parameters():
3 param.requires_grad = False
----> 4 (tensor_train, image_album) = feature_extr_cdist_train(modified_pretrained, load_train)

in feature_extr_cdist_train(model, dataloader)
14 print("Output before redimension",outputs.size())
15 #print("antes", outputs.size())
---> 16 outputs = outputs.view(10, 432640) #8028160
17
18 #print(outputs.size())

RuntimeError: shape '[10, 432640]' is invalid for input of size 432640
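In the traceback above, the tensor is [10, 256, 13, 13], which holds 432640 elements in total, so it cannot be viewed as [10, 432640]. One way to avoid hard-coding either number is to let `view()` infer the second dimension with `-1`. This is a minimal sketch using zero tensors as stand-ins for the real network output; the batch of 4 images is an assumption made to illustrate a smaller final batch.

```python
import torch

# Stand-in for the AlexNet feature map printed in the traceback.
outputs = torch.zeros(10, 256, 13, 13)

# -1 asks view() to infer the remaining size, so neither the batch size
# nor the per-image feature count is hard-coded.
flat = outputs.view(outputs.size(0), -1)
print(flat.shape)  # torch.Size([10, 43264])

# torch.flatten with start_dim=1 gives the same result.
same = torch.flatten(outputs, start_dim=1)

# A smaller final batch (assumed here to hold 4 images) still works.
last = torch.zeros(4, 256, 13, 13)
last_flat = last.view(last.size(0), -1)
print(last_flat.shape)  # torch.Size([4, 43264])
```

Reading the batch size from the tensor also keeps the call working when the dataloader yields a last batch with fewer than 10 images.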

Nikronic · July 6, 2020, 1:16pm

I think you have the same issue here.
Are your feature maps of size [10, 208, 208]? In this case you have again multiplied by 10, so you need to use

outputs = outputs.view(10, 43264) #8028160

instead of: