Using multi-dimensional labels for training

I have tensors that represent the similarity between digits in MNIST (I have already computed these from embeddings), e.g.

[tensor([0.1894, 0.0839, 0.0679, 0.0888, 0.1104, 0.1004, 0.0859, 0.0706, 0.0864,
         0.1163]),
 tensor([0.0774, 0.1747, 0.0541, 0.0795, 0.0907, 0.1100, 0.0798, 0.1180, 0.1238,
         0.0922]),
 tensor([0.0717, 0.0619, 0.2002, 0.1048, 0.0822, 0.0932, 0.1339, 0.0686, 0.0929,
         0.0906]),
 tensor([0.0856, 0.0831, 0.0956, 0.1826, 0.0759, 0.1335, 0.1026, 0.0836, 0.0900,
         0.0674]),
 tensor([0.1050, 0.0936, 0.0740, 0.0749, 0.1802, 0.0929, 0.0692, 0.0887, 0.0852,
         0.1362]),
 tensor([0.0869, 0.1032, 0.0763, 0.1198, 0.0845, 0.1639, 0.0616, 0.0975, 0.1070,
         0.0992]),
 tensor([0.0825, 0.0831, 0.1217, 0.1022, 0.0698, 0.0684, 0.1819, 0.1017, 0.1140,
         0.0747]),
 tensor([0.0680, 0.1234, 0.0626, 0.0836, 0.0899, 0.1086, 0.1021, 0.1827, 0.0974,
         0.0818]),
 tensor([0.0761, 0.1182, 0.0774, 0.0822, 0.0789, 0.1088, 0.1045, 0.0889, 0.1667,
         0.0984]),
 tensor([0.1063, 0.0914, 0.0784, 0.0639, 0.1309, 0.1048, 0.0712, 0.0776, 0.1022,
         0.1732])]

In the above, the first tensor corresponds to the class ‘0’, and each value is a measure of similarity to the class at that index. The tensors are normalised, so each one's elements sum to 1.0.

What I would like to do is use these as labels to train a different network. It is effectively label smoothing, but with a non-uniform distribution: the idea is that the labels carry additional information about the similarity between classes.

I have tried to find a way to assign these labels manually, but it seems that PyTorch's built-in losses just want a single integer value per label. Is there any way to train with labels like these? This is the loss I am working from:

import torch
import torch.nn.functional as F


def SoftCrossEntropy(inputs, target, reduction='sum'):
    # Cross-entropy against a full target distribution rather than a
    # class index: sum over classes of -target * log_softmax(inputs).
    log_likelihood = -F.log_softmax(inputs, dim=1)
    batch = inputs.shape[0]
    if reduction == 'mean':
        loss = torch.sum(torch.mul(log_likelihood, target)) / batch
    else:
        loss = torch.sum(torch.mul(log_likelihood, target))
    return loss
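
To sanity-check it, I call it on random tensors like this (the shapes and names here are just an illustration; soft_targets stands in for my similarity rows):

logits = torch.randn(4, 10)                           # dummy batch of predictions
soft_targets = F.softmax(torch.randn(4, 10), dim=1)   # rows sum to 1, like my labels

loss = SoftCrossEntropy(logits, soft_targets, reduction='mean')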

Thanks in advance for any pointers. I have tried to look up the labels inside the loss's forward method, but this did not work. Any idea how to do it, please? My code is below:

import torch as tc
import torch.nn as nn
import torch.nn.functional as F


class SoftCrossEntropy(nn.Module):
    def __init__(self, labels, reduction='sum'):
        super().__init__()
        self.labels = labels        # (num_classes, num_classes) soft-label matrix
        self.reduction = reduction

    def forward(self, preds, target):
        # Replace each integer class index with its soft-label row
        target = self.labels[target]
        log_likelihood = -F.log_softmax(preds, dim=1)
        batchsize = preds.shape[0]
        if self.reduction == 'mean':
            loss = tc.sum(tc.mul(log_likelihood, target)) / batchsize
        else:
            loss = tc.sum(tc.mul(log_likelihood, target))
        return loss
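
Instantiated with the ten rows shown at the top (similarity_rows is just my own name for that list), I would expect to use it like this:

tlabels = tc.stack(similarity_rows)   # stack the ten tensors into a 10x10 matrix
criterion = SoftCrossEntropy(tlabels, reduction='mean')

logits = tc.randn(4, 10)              # dummy predictions for a batch of 4
y = tc.tensor([2, 8, 2, 6])           # integer class targets
loss = criterion(logits, y)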

import pytorch_lightning as pl
from pytorch_lightning.core.decorators import auto_move_data
from torchvision import models


class ResNetMNIST_ls(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = models.resnet18(num_classes=10)
        # Single-channel input for MNIST
        self.model.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2),
                                     padding=(3, 3), bias=False)
        self.loss = SoftCrossEntropy(labels)  # labels: the 10x10 soft-label tensor

    @auto_move_data
    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_no):
        x, y = batch
        logits = self(x)
        loss = self.loss(logits, y)
        return loss

    def validation_step(self, batch, batch_no):
        x, y = batch
        logits = self(x)
        loss = self.loss(logits, y)
        self.log('val_loss', loss)

    def configure_optimizers(self):
        return tc.optim.Adam(self.parameters(), lr=1e-3)

To give an example of what I need, say I have the following labels as target:

tensor([2, 8, 2, 6])

I would like to replace each label with the corresponding row of the similarity tensors I have. I think this means ending up with a 2D tensor like:

tensor(
[[0.0902, 0.0642, 0.1733, 0.1177, 0.0837, 0.0939, 0.0653, 0.1129, 0.0894, 0.1095],
 [0.1019, 0.1095, 0.0896, 0.0903, 0.0637, 0.0944, 0.0465, 0.1101, 0.1737, 0.1204],
 [0.0902, 0.0642, 0.1733, 0.1177, 0.0837, 0.0939, 0.0653, 0.1129, 0.0894, 0.1095],
 [0.1262, 0.1171, 0.0811, 0.0780, 0.1252, 0.0888, 0.2153, 0.0586, 0.0576, 0.0521]]
)

This is how I managed to do it, unfortunately having to convert to NumPy arrays in a couple of places:

import numpy as np

# y.shape -> torch.Size([32])
# tlabels.shape -> torch.Size([10, 10])

y = tc.Tensor(np.array([np.array(tlabels[lbl]) for lbl in y])).cuda()

# y.shape -> torch.Size([32, 10])

How could this be done using just torch tensor manipulation?
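
Edit: I think plain advanced indexing might be the answer; indexing tlabels with the integer label tensor selects whole rows in one step, with no NumPy round trip (only lightly tested, so corrections welcome):

y = tlabels.to(y.device)[y]   # shape: torch.Size([32, 10])

# Equivalent alternative:
# y = tc.index_select(tlabels.to(y.device), 0, y)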