Average of tensors

I am using a pretrained VGG16 model to extract features from an image. The image is 256x340, but VGG16 expects 224x224 input, so I made five 224x224 crops from the original image (top-left, top-right, bottom-left, bottom-right, center). I pass these five crops to the model as a tensor of shape [5, 3, 224, 224] and get an output of shape [5, 4096] (I modified the last layer of VGG16 to return 4096 features). Now I want to average these five outputs to get a single feature vector for the original image. Any suggestions on how to do this?

Would taking the mean along axis 0 be a solution here?

Hi,

Yes, it is.
Taking the mean over dimension 0 will give you a single set of features for the whole set of crops.
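
As a minimal sketch of that suggestion, the snippet below averages a [5, 4096] tensor over dimension 0. The random tensor only stands in for the real model output, and the variable names are made up for illustration.

```python
import torch

# Stand-in for the model output: 5 crops, 4096 features each.
crop_features = torch.randn(5, 4096)

# Average across the crop dimension (dim 0) to get one vector per image.
image_features = crop_features.mean(dim=0)

print(image_features.shape)  # torch.Size([4096])
```

If you later batch several images and end up with a [B, 5, 4096] tensor, the same idea should apply with `dim=1` instead.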

Hello @albanD, I have a similar problem. My dataset class returns the image, label, and ID. How would I average the features over each ID? Each ID can have anywhere from 1 to N pictures. My code so far:

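import torch.nn as nn
import torch.nn.functional as F


class SuperEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder for the 3-channel roof images.
        self.roofEncoder = nn.Sequential(
            nn.Conv2d(3, 6, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(6, 12, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

        # Encoder for the 1-channel dwelling images.
        self.dwellingEncoder = nn.Sequential(
            nn.Conv2d(1, 6, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(6, 12, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

        # Each encoder outputs 12 channels, so the flattened size is 54 * 54 * 12
        # (not 16). The 54 x 54 spatial size assumes 216x216 input images.
        self.fc1 = nn.Linear(54 * 54 * 12, 1000)
        self.fc2 = nn.Linear(54 * 54 * 12, 1000)

        # TODO: define self.fc_out (the output layer is not written yet).

    def forward(self, x1, x2):
        # Roof branch: encode, flatten, project to 1000 features.
        x1 = self.roofEncoder(x1)
        x1 = x1.view(x1.size(0), -1)
        x1 = F.relu(self.fc1(x1))

        # Dwelling branch: same steps on the second input.
        x2 = self.dwellingEncoder(x2)
        x2 = x2.view(x2.size(0), -1)
        x2 = F.relu(self.fc2(x2))

        # Per-ID averaging and the fc_out layer still need to be added here.
        return x1, x2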

For more context see: How do I average photo feature outputs for later concatenation?
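
One possible way to average over IDs, sketched under some assumptions: the features for a batch are a [B, D] tensor, and the IDs are an integer tensor of length B. The helper name `mean_by_id` is hypothetical and not part of PyTorch. It sums the rows that share an ID with `index_add_` and divides by the per-ID counts.

```python
import torch


def mean_by_id(features: torch.Tensor, ids: torch.Tensor):
    """Average the rows of `features` ([B, D]) that share the same entry in `ids` ([B]).

    Hypothetical helper for illustration. Returns (unique_ids [K], means [K, D]).
    """
    # Map every ID to a group index in 0..K-1 (unique IDs come back sorted).
    unique_ids, inverse = torch.unique(ids, return_inverse=True)
    num_groups = unique_ids.size(0)

    # Sum the feature rows belonging to each group.
    sums = torch.zeros(num_groups, features.size(1),
                       dtype=features.dtype, device=features.device)
    sums.index_add_(0, inverse, features)

    # Count how many rows each group has, then divide to get the mean.
    counts = torch.bincount(inverse, minlength=num_groups).unsqueeze(1)
    return unique_ids, sums / counts.to(features.dtype)


# Three pictures: two belong to ID 7, one to ID 9.
feats = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
ids = torch.tensor([7, 7, 9])
unique_ids, means = mean_by_id(feats, ids)
print(unique_ids)  # tensor([7, 9])
print(means)       # tensor([[2., 3.], [5., 6.]])
```

This only averages IDs that appear within the same batch. If the pictures for one ID can be spread across batches, you would probably need to group them in the dataset or sampler first.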