I am doing some research on CNNs and I want to extract features manually through a function.
The problem I am facing is that I want to change the shape of the output tensor for an image batch so that I can treat it in a more linear (flattened) way.
I am using VGG16 and AlexNet. In VGG16, Conv5_3 gives a tensor of shape (10, 512, 14, 14). I want to keep the first dimension for the number of images in the minibatch and get a result like (10, 100352). This case works without any issue.
If I go to Conv4_3 instead, the tensor I want to convert has shape (10, 512, 28, 28), which I expected to become a matrix of shape (10, 4014080), but I get an error message like this: "shape '[10, 4014080]' is invalid for input of size 4014080".
I am pretty sure it is because it goes over some limit on dimension size, but I cannot confirm this and have no clue what is really going on.
If this is the problem, does anyone know a way to change the tensor dimensions in the code below?
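A quick check of the arithmetic may help here. The sketch below is plain Python, uses only the shapes quoted above, and introduces a hypothetical helper `flattened_shape` for illustration. It suggests that 4014080 is the element count of the whole batch rather than of one image, so a size limit is probably not what triggers the error.

```python
from math import prod

# Shapes quoted in the post: (batch, channels, height, width).
conv5_3_shape = (10, 512, 14, 14)
conv4_3_shape = (10, 512, 28, 28)


def flattened_shape(shape):
    """Hypothetical helper: keep the batch dimension, multiply the rest."""
    batch = shape[0]
    return (batch, prod(shape[1:]))


# Total number of elements in the Conv4_3 batch (all 10 images together).
total_conv4_3 = prod(conv4_3_shape)

print(flattened_shape(conv5_3_shape))  # (10, 100352)
print(flattened_shape(conv4_3_shape))  # (10, 401408)
print(total_conv4_3)                   # 4014080
```

Under this reading, a shape of (10, 4014080) would need ten times more elements than the tensor holds, which matches the wording of the error message.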
import torch
import torch.nn as nn
from torchvision import models

net = models.alexnet(pretrained=True)
num_features = net.classifier.in_features  # 4096
features = list(net.classifier.children())[:-2]  # Remove last layer
print(features)
net_.classifier = nn.Sequential(*features)  # Replace the model classifier
# Truncate the feature extractor: relu5_3 -> [:-2], relu4_3 -> [:-9]
modified_pretrained = nn.Sequential(*list(net.features.children())[:-2])

def feature_extr_cdist_train(model, dataloader):
    # The second dimension must be changed depending on the number of features:
    # 1000, 4096, 100352, 4014080; AlexNet: 432640
    features_list = torch.empty((0, 432640), dtype=torch.float).cuda()
    image_album = torch.empty([1, 3, 224, 224], dtype=torch.float).cuda()
    counter = 0
    model.cuda()
    for i, data in enumerate(dataloader, 0):
        input, label = data
        input, label = input.to(device), label.to(device)
        n, c, h, w = input.size()
        outputs = model(input)
        outputs = outputs.view(10, 432640)  # 8028160
        if (i == 0):
            # Seed the feature list and the image album with the first batch
            features_list = torch.cat((features_list, outputs.view(1, 432640)), 0)  # view(1, -1)
            image_album = input
        # Pairwise L2 distances between the new outputs and the stored features
        dist_tensores = torch.cdist(outputs, features_list, p=2.0)
        activation = torch.gt(dist_tensores, AVG, out=torch.cuda.FloatTensor(len(outputs), len(features_list)))
        counter = len(features_list)
        # Keep only samples farther than AVG from every stored feature
        idx = activation.sum(1) == counter
        features_list = torch.cat((features_list, outputs[idx]), 0)
        image_album = torch.cat((image_album, input[idx, :, :, :]), 0)
    return features_list, image_album
My issue is in the part where I pass the input through the CNN and reshape the output with view().
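For the view() step, a minimal sketch of one common way to flatten per image is shown below. It assumes a small stand-in tensor of shape (10, 8, 4, 4) instead of a real Conv4_3 output, and the variable names are illustrative only. Passing -1 lets PyTorch infer the number of features per image, so the value does not have to be hard-coded for each layer.

```python
import torch

# Small stand-in for a convolutional activation: (batch, channels, height, width).
outputs = torch.zeros(10, 8, 4, 4)

n = outputs.size(0)

# Keep the batch dimension and let PyTorch infer the rest (8 * 4 * 4 = 128).
flat_view = outputs.view(n, -1)
flat_flatten = torch.flatten(outputs, start_dim=1)  # equivalent alternative
print(flat_view.shape)

# Requesting (batch, total element count) asks for 10x too many elements
# and raises a RuntimeError of the same kind as the one in the question.
view_failed = False
try:
    outputs.view(n, outputs.numel())
except RuntimeError as err:
    view_failed = True
    print(err)
```

If this reading is right, the pre-allocated `features_list` would also need a second dimension equal to the per-image feature count (for example 432640 / 10 = 43264 for the AlexNet layer in the code), and `outputs.view(n, -1)` could replace the hard-coded `view(10, 432640)`.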
Thanks in advance