Index to matrix transform

Mohammadreza_Nazari · December 31, 2018, 8:31pm

I am wondering what the most efficient way is to convert a list of index lists into a 0/1 matrix. For example, I want to convert
ind : [[1,3],[1,2,3]]
to the form of
torch.tensor([[0,1,0,1],[0,1,1,1]])

Thanks

Hong · December 31, 2018, 10:18pm

You can do something like:

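import torch

a = [[1,3],[1,2,3]]
tensors = []
for l in a:
    # One row of length 4 with ones at the positions listed in l
    tensors.append(torch.zeros(4).scatter_(0, torch.tensor(l), 1))
result = torch.stack(tensors, 0)  # shape (2, 4), dtype float32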

Mohammadreza_Nazari · December 31, 2018, 11:13pm

Thanks, this works. I was wondering whether there is a more efficient PyTorch function that can do it without the for loop. It seems there is not.
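For what it's worth, the per-row scatter_ calls can probably be replaced by a single indexed assignment, although the ragged lists still have to be flattened in Python first. The sketch below is one possible way, not a built-in function for this task; the names ind, num_cols, rows and cols are made up for illustration, and num_cols = 4 is assumed from the example in the question.

```python
import torch

# Ragged index lists from the question; num_cols = 4 is assumed from the example.
ind = [[1, 3], [1, 2, 3]]
num_cols = 4

# Row index for every entry: row i is repeated len(ind[i]) times.
lengths = torch.tensor([len(row) for row in ind])
rows = torch.repeat_interleave(torch.arange(len(ind)), lengths)

# Column index for every entry: the flattened index lists.
cols = torch.tensor([c for row in ind for c in row])

# A single indexed assignment sets all the ones at once.
result = torch.zeros(len(ind), num_cols, dtype=torch.long)
result[rows, cols] = 1
print(result)
```

The list comprehensions still run in Python, so the gain over the loop comes only from issuing one tensor write instead of one per row.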

Konstantin_Burlachen · May 12, 2022, 4:40pm

Here is a code snippet for the case where you want to compute a modified cross-entropy loss with sum reduction in which samples, rather than classes, are weighted.

It relies on the fact that a 2-D tensor can be indexed with a tuple of two elements: the first gives the indices along the first axis, and the second (a list or other iterable) gives the indices along the second axis.
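A minimal illustration of that kind of indexing, separate from the loss code; the tensor m and the column values in cols are hypothetical and chosen only to make the effect visible:

```python
import torch

m = torch.zeros(3, 5)
rows = torch.arange(3)           # one row index per sample
cols = torch.tensor([4, 0, 2])   # hypothetical column (class) index per sample

# Pairs the indices element-wise: writes m[0, 4], m[1, 0] and m[2, 2].
m[rows, cols] = 1.0

# The same pairing works for reading the selected elements back.
picked = m[rows, cols]
print(m)
print(picked)
```

The two index sequences must have the same length (or be broadcastable), because they are paired element by element rather than combined as a grid.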

#!/usr/bin/env python3

import torch
import torch.nn as nn

# input.shape[0] -- number of samples
# input.shape[1] -- number of classes with linear scores for each class

loss = nn.CrossEntropyLoss(reduction="sum")
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
output = loss(input[0:2,:], target[0:2])

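# Softmax probabilities computed by hand, then their negative logs
e = input.exp()
eSum = torch.sum(e, dim=1)
eSumRepeat = eSum.repeat((input.shape[1], 1)).T
eSumNormalized = -torch.log(e / eSumRepeat)

# One-hot mask that selects the target class of each sample
mask = torch.zeros(eSumNormalized.shape)
mask[range(input.shape[0]), target] = 1.0

# Per-sample weights; the third sample is weighted out
weight = torch.tensor([1.0, 1.0, 0.0])
weightRepeat = weight.repeat((input.shape[1], 1)).T

loss = (eSumNormalized * mask * weightRepeat).sum()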

print(output, loss)
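The same per-sample weighting can likely be written more compactly by asking for the unreduced loss and weighting it afterwards. This is a sketch of an alternative, not the snippet above rewritten line for line; the names logits, per_sample, weighted and reference are chosen here for illustration, and the seed is fixed only to make the run repeatable.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5, requires_grad=True)   # 3 samples, 5 classes
target = torch.randint(0, 5, (3,))               # one class index per sample
weight = torch.tensor([1.0, 1.0, 0.0])           # per-sample weights

# reduction="none" keeps one loss value per sample, shape (3,).
per_sample = F.cross_entropy(logits, target, reduction="none")
weighted = (per_sample * weight).sum()

# With weights [1, 1, 0] this should match the summed loss of the first two samples.
reference = F.cross_entropy(logits[0:2], target[0:2], reduction="sum")
print(weighted, reference)
```

Because this goes through log-softmax internally, it also avoids the explicit exp() of the raw scores, which can overflow for large inputs.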

P.S. Cross-entropy in PyTorch terminology means the usual cross-entropy CE(P, Q) with two extra assumptions:

Q is not required to be a p.m.f. before CE is computed; it is given as arbitrary scores, which are mapped onto the probability simplex by the usual softmax (multiclass logistic) transformation.
With class-index targets, as used here, P is assumed to be a discrete delta function, which signal processing and machine learning call a one-hot vector.
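
Written out, these two assumptions reduce the loss to a simple closed form. The notation is introduced here for illustration only: x_{i,c} is the score of class c for sample i, t_i is the target class of sample i, and w_i is the weight of sample i.

```latex
\mathrm{CE}(P, Q) = -\sum_{c} P(c)\,\log Q(c),
\qquad
Q_i(c) = \frac{e^{x_{i,c}}}{\sum_{k} e^{x_{i,k}}},
\qquad
P_i(c) = \begin{cases} 1 & c = t_i \\ 0 & \text{otherwise} \end{cases}

\mathrm{CE}(P_i, Q_i) = -\log Q_i(t_i) = -x_{i,t_i} + \log \sum_{k} e^{x_{i,k}}

L = \sum_{i} w_i \,\mathrm{CE}(P_i, Q_i)
```

The last line is the sample-weighted sum that the snippet computes with the mask and the repeated weight vector.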