Replacing a for loop with indexing

I have a 3-dimensional tensor of shape (10_000, 10, 3): 10_000 examples x 10 predicted labels x outputs from 3 models.

I also have a tensor with ids of the model outputs I would like to use for each label. It has 10 values between 0 and 2.

Is there any way I could index into the 3d tensor picking an output for a label from a specified model? At the end I would like to have a 10_000 examples x 10 labels tensor where for each of the labels I picked predictions from a model of my choosing.

I am currently doing this via permuting the original tensor to be of shape (10, 3, 10_000) and looping over the labels. I store the outputs in a list and concatenate them into a tensor at the end.

I tried using tensor.gather and tensor.index_select but couldn’t get either to work. Intuitively I feel there must be a better way of doing this.

This is the code I have:

labels = []
for label, idx in zip(preds.permute(2, 0, 1), best_model_idx_per_label):


Would be grateful for any help. Thank you!

If I understand you correctly, you would like to select from the last dimension using an index tensor with values in [0, 2].
The same operation should be performed for all examples?

If so, you could try the following code:

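import torch

a = torch.randn(20, 10, 3)  # (examples, labels, models)

idx = torch.zeros(10).random_(3).long()  # one model id in [0, 2] per label
idx = idx.unsqueeze(0).repeat(20, 1)  # same ids for every example: (20, 10)
b = a.gather(2, idx.view(20, 10, 1))  # (20, 10, 1); b.squeeze(2) gives (20, 10)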

I’ve just used 20 for the batch dimension.
Let me know, if this works for you or if I misunderstood your question.
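
As a rough check of the approach above, the sketch below compares gather against a plain loop over the labels and against advanced indexing. It reuses the names preds and best_model_idx_per_label from the question; the sizes, the random data, and the use of expand (a view) in place of repeat (a copy) are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)

n_examples, n_labels, n_models = 20, 10, 3
preds = torch.randn(n_examples, n_labels, n_models)
best_model_idx_per_label = torch.randint(0, n_models, (n_labels,))

# Loop baseline: for each label, take the predictions of its chosen model.
looped = torch.stack(
    [preds[:, j, best_model_idx_per_label[j]] for j in range(n_labels)], dim=1
)

# gather: broadcast the per-label ids over the example dimension as a view.
index = best_model_idx_per_label.view(1, n_labels, 1).expand(n_examples, n_labels, 1)
gathered = preds.gather(2, index).squeeze(2)

# Advanced indexing: pair each label position with its chosen model id.
indexed = preds[:, torch.arange(n_labels), best_model_idx_per_label]

print(gathered.shape)
```

All three results should have shape (20, 10) and hold the same values; which form reads best is largely a matter of taste.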


Thank you very much for your help, really appreciate it :slight_smile: This works perfectly.

Really neat to see how you went about achieving this via growing the idx tensor to a shape that was needed and then using gather. Quite a learning experience!

Thank you!


Hi @ptrblck

How can I replace this loop:
import torch
for i in range(d.size(0)):


@abd quite simple.

import torch
d = torch.randn(20, 10, 3)
d = d * 2 + 20
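
The body of the loop is not shown in the thread, so the following is only a sketch under the assumption that it applied the same arithmetic to one slice d[i] at a time. In that case the loop and the single elementwise expression should give the same result:

```python
import torch

torch.manual_seed(0)
d = torch.randn(20, 10, 3)

# Hypothetical loop (an assumption, not the poster's code):
# apply the same arithmetic to one slice at a time.
looped = d.clone()
for i in range(looped.size(0)):
    looped[i] = looped[i] * 2 + 20

# Vectorized: one elementwise expression over the whole tensor.
vectorized = d * 2 + 20

print(torch.allclose(looped, vectorized))
```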

Hello @ptrblck

I have an efficiency issue with a for loop over tensors.

I’m extracting the features from the last layer of a CNN through an image data loader (I’m using batch size 8). I compute the Euclidean distance between the batch tensor and a table of previously stored features.

I want to add a tensor to the table whenever its distances to all the tensors in the table are above a threshold. I have implemented code that runs successfully, but the loop I use is not efficient, and I’m wondering how I could do something similar in a more efficient way than this sequential one.

for i, data in enumerate(dataloader, 0):
    input, label = data
    input, label =,
    n, c, h, w = input.size()
    outputs = model(input)
    if (i == 0):
        features_list = (features_list, outputs[0].view(1,-1)), 0)
    dist_tensores = torch.cdist(outputs, features_list, p=2.0)
    activation =, AVG, out=torch.cuda.FloatTensor(len(outputs), len(features_list)))
    counter = len(features_list)
    activation_list = torch.sum(activation, dim=0)
    for x in range(len(activation)):
        if (torch.sum(activation[x], dim=0) == counter):
            features_list = (features_list, outputs[x].view(1,-1)), 0)

Any ideas?

Could you post the shapes of all necessary tensors in order to create an executable code snippet, please?
PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier :wink:

First of all, thanks for answering!

features_list = torch.empty( (0, 1000), dtype=torch.float ).cuda()

First we create an empty tensor with 1000 columns (output of VGG16 conv_layer5_3).

Batch size of the dataloader = 8

    counter = 0                                                     
    for i, data in enumerate(dataloader, 0):
        # Tensor extraction
        input, label = data                                             
        input, label =,              
        n,c ,h,w = input.size()                                         
        outputs = model(input)   

This outputs tensor has dims: torch.Size([8, 1000])

        if (i == 0):                                                    
            features_list = (features_list, outputs[0].view(1,-1)), 0)

For the very first iteration we add the first tensor that comes in the mini-batch.

        dist_tensores = torch.cdist(outputs, features_list, p=2.0) 

The dimensions of this distance tensor depend on the iteration; for the first three iterations they are:
1st: torch.Size([8, 1])
2nd: torch.Size([8, 6])
3rd: torch.Size([8, 11])

AVG = 60
        activation =, AVG, out=torch.cuda.FloatTensor(len(outputs), len(features_list)))

The shape depends on the iteration but is always [8, X], where X is the number of tensors in features_list.

        counter = len(features_list)
        activation_list = torch.sum(activation, dim=0)

        for x in range(len(activation)):
          if (torch.sum(activation[x], dim=0) == counter):
            features_list = (features_list, outputs[x].view(1,-1)), 0)

If the sum over all the positions in dim 0 equals the number of rows of features_list, we proceed to add the tensor via

My issue is that I don’t know how to add the tensor outputs[x] without using a loop. I think there are vectorized replacements for “for loops” like this, or maybe it’s just a vectorized comparison.

Once again, thanks for your answer. :slight_smile:

@ptrblck Here is the code; I hope it is better this way :slight_smile:

You could probably avoid the last loop using:

idx = activation.sum(1) == counter
features_list =, outputs[idx]), 0)

However, as I don’t have dummy inputs and the expected results, I couldn’t really test it, so you would need to verify it.
I don’t think you could avoid the dist_tensor calculation, as it would be recalculated for the current output sample, if I understand the use case correctly.
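
To make the suggestion easier to verify, here is a small CPU-only sketch that runs the loop and the boolean-mask version side by side. Several pieces are assumptions, since the posted code has gaps: the concatenation is written with torch.cat, the activation is taken to be 1.0 where the distance exceeds AVG, and the toy tensors stand in for the real features.

```python
import torch

AVG = 60.0  # distance threshold used in the thread

# Toy stand-ins (assumptions): 2 stored feature rows, 4 new outputs, 4 columns.
features_list = torch.tensor([[0.0, 0.0, 0.0, 0.0],
                              [1.0, 0.0, 0.0, 0.0]])
outputs = torch.tensor([[100.0, 0.0, 0.0, 0.0],   # far from both stored rows
                        [0.0, 0.0, 0.0, 0.0],     # identical to a stored row
                        [0.0, 80.0, 0.0, 0.0],    # far from both stored rows
                        [30.0, 0.0, 0.0, 0.0]])   # too close to both stored rows

dist = torch.cdist(outputs, features_list, p=2.0)  # shape (4, 2)
activation = (dist > AVG).float()  # assumed thresholding step
counter = features_list.size(0)

# Loop version: append each output whose distances all exceed the threshold.
looped = features_list
for x in range(activation.size(0)):
    if activation[x].sum() == counter:
        looped = torch.cat((looped, outputs[x].view(1, -1)), 0)

# Mask version: select all qualifying rows at once and concatenate once.
idx = activation.sum(1) == counter
masked = torch.cat((features_list, outputs[idx]), 0)

print(torch.equal(looped, masked))
```

Note that both versions compare each output only against the table as it was before the batch; outputs within the same batch are not compared with each other, which matches the behavior of the original loop.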


@ptrblck That is exactly what I was looking for! Thanks for the help!