During my CNN training, I have:
- My image feature map, with shape [10, 256, 64, 64], where 10 is the batch size, 256 is the number of channels, and both 64s are the height and width of the feature map.
- Another tensor (a points tensor) with shape [10, 64, 2], which is basically a stack of 2D coordinates. So for each sample, I have 64 points with x and y coordinates.
My question is: how do I select the values in my feature map tensor based on the coordinates in my points tensor?
The output should ideally have shape [10, 64, 256], since for each point I want the values from all 256 channels.
How could I do that without running a Python loop?
My current approach is below:
results = torch.zeros(10, 64, 256)
feature_map = feature_map.permute(0, 2, 3, 1)  # move channels last: [10, 64, 64, 256]
for i in range(feature_map.shape[0]):  # select i-th sample
    for j in range(points.shape[1]):  # select j-th point for the current sample
        x, y = points[i, j, 0], points[i, j, 1]
        results[i, j] = feature_map[i, y, x]  # 256 channel values at coordinate (x, y)
I am wondering if there is a more efficient way of doing this kind of indexing. Any help would be appreciated.
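In case it helps frame the question, here is a sketch of what I think a loop-free version might look like, using advanced indexing with a broadcast batch index (random data stands in for my real tensors; the key assumption is that each point stores an (x, y) pair, with y indexing the height dimension):

```python
import torch

# Shapes from the question: batch=10, channels=256, H=W=64, 64 points per sample.
B, C, H, W, P = 10, 256, 64, 64, 64
feature_map = torch.randn(B, C, H, W)              # stand-in for the real feature map
points = torch.randint(0, H, (B, P, 2))            # stand-in for the real (x, y) points

batch_idx = torch.arange(B).unsqueeze(1)           # [B, 1], broadcasts against the point indices
x = points[..., 0]                                 # [B, P] column (width) indices
y = points[..., 1]                                 # [B, P] row (height) indices

# Advanced indexing: the three index tensors broadcast to [B, P], and because they
# are separated by the channel slice, the indexed dims move to the front -> [B, P, C].
results = feature_map[batch_idx, :, y, x]

print(results.shape)  # torch.Size([10, 64, 256])
```

If this is the right idea, it should give the same values as the double loop above, just without iterating in Python.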