Angular Features

Hi ptrblck,
Could you please help me implement the highlighted equations in PyTorch, extracting the features from the CIFAR dataset with a ResNet-32 architecture? I have been trying for the last couple of days without success.
I would be very grateful for your help.


Could you post your current code snippets and explain where you are currently stuck, please?

Dear ptrblck,
Thanks for the reply. In the following code snippet, the features are extracted at line 386. I want to find the angles between these features and their corresponding class centers. I am stuck on how to find the classes, their centers, and the angles between the class centers and the corresponding features. Although this is described in the paper (sent before), I have not been able to implement it.

Based on the features output of the model, you could start by calculating the class centers.
I don’t see a clear description of how these centers are calculated, so I assume they are just the mean of the features corresponding to each class?
If so, you could use the target to index features and thus split it into per-class features.
Once this is done, you could then apply the mean to generate the class centers.

Hi ptrblck,
Thanks a lot for your reply. However, it all went over my head. Could you please help with that?

Assuming my description is right, you could use this simple example, which indexes the output with the targets of the current batch and collects the features per class:

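import torch

nb_classes = 3
batch_size = 10
features = 4
output = torch.randn(batch_size, features)
targets = torch.randint(0, nb_classes, (batch_size,))
print(targets)

> tensor([0, 0, 0, 2, 0, 1, 1, 2, 2, 2])

class_features = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    idx = targets == class_index   # boolean mask selecting this class
    class_feat = output[idx]       # shape [n_samples_in_class, features]
    class_features[class_index].extend(class_feat)  # add each row to the class list
print(class_features)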

> {0: [tensor([-0.8338,  0.3141, -0.2840,  0.6104]), tensor([-1.5458, -0.3546, -0.3190,  0.6153]), tensor([-1.3919, -0.4112,  0.4425,  0.8475]), tensor([ 0.3294, -0.2577,  0.3397,  0.4239])], 1: [tensor([-1.4398,  0.2516,  1.6932, -0.0364]), tensor([-0.6315, -0.6568,  0.7358,  0.5755])], 2: [tensor([ 1.3726, -0.7952,  2.1696,  0.6634]), tensor([ 0.4024, -2.2702,  2.2658,  2.6325]), tensor([ 0.2029, -1.4608, -0.2938,  1.0877]), tensor([-1.2883, -0.5849,  0.2535, -0.0638])]}

In the next step you could then use another loop to calculate the mean of these features:

class_features_mean = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    tmp = torch.stack(class_features[class_index])
    class_features_mean[class_index] = tmp.mean(dim=0)

> {0: tensor([-0.8606, -0.1773,  0.0448,  0.6243]), 1: tensor([-1.0357, -0.2026,  1.2145,  0.2696]), 2: tensor([ 0.1724, -1.2778,  1.0988,  1.0799])}

Note that you could of course speed up this code, but I would recommend sticking to loops for now until you are sure the method works as intended.
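With the class means available, the angle between each feature and its own class center (the quantity asked about earlier in the thread) could be sketched as below. This is only a minimal sketch: it assumes the class center is the per-class feature mean and that the angle is the arccosine of the cosine similarity, which may differ from the paper's exact definition. The fixed `targets` tensor and the name `centers` are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
nb_classes, batch_size, feat_dim = 3, 10, 4
output = torch.randn(batch_size, feat_dim)
# Fixed targets (illustrative) so that every class is present in the batch
targets = torch.tensor([0, 0, 0, 2, 0, 1, 1, 2, 2, 2])

# Class centers as per-class feature means (assumption, see above)
centers = torch.stack([output[targets == c].mean(dim=0) for c in range(nb_classes)])

# Cosine of the angle between each feature and the center of its own class
cos = F.cosine_similarity(output, centers[targets], dim=1)
# Clamp guards against values slightly outside [-1, 1] caused by rounding
angles = torch.acos(cos.clamp(-1.0, 1.0))  # radians, shape [batch_size]

# Mean and (population) variance of the angles within each class
for c in range(nb_classes):
    a = angles[targets == c]
    print(c, a.mean().item(), a.var(unbiased=False).item())
```

Indexing `centers[targets]` picks the matching center for every sample in one step, so no inner loop over samples is needed.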

Thank you very much ptrblck, I will try to incorporate it in my work.

Heaps, heaps of thanks and stay blessed.


Hi, ptrblck,
I tried to incorporate that logic into my work. However, in my case the feature tensor has the shape [128, 2, 128] (batch size 128) instead of a feature dimension of 4 as in your example, and I think this is causing the problems. Could you please have a look at the code and the corresponding output shown in the following screenshots?
[Screenshot: code snippet]


What is the shape of targets?
PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier. :wink:

Hi ptrblck,
By targets, do you mean the total number of classes or the training targets? I am using the CIFAR-10 dataset, so the number of target classes would be 10.

In that case, could you compare your current tensors, including their shapes, to my code snippet, as it’s working? If you get stuck, could you post an executable code snippet to reproduce this issue?

Thanks, ptrblck, for your reply. I am really stuck after spending a couple of days on this problem.
What I want to do:
I want to find the average angular variance of the head classes (say, classes with more than 500 samples are head classes and the others are tail classes) and then transform it to the tail classes. I think the solution you proposed in your last post is fine. So initially I want to extract the features for all classes, associate those features with their corresponding classes, and then find the angles between the features and the corresponding class centers. The solution you provided makes sense to me; however, in my case the feature size is [128, 2, 128] instead of 4. Could you please help with this? I would be very grateful.

The corresponding code snippet is given below.

def train(train_loader, model, criterion, optimizer, epoch, opt, train_targets):
    """one epoch training"""
    batch_time = AverageMeter()
    data_time = AverageMeter()
    losses = AverageMeter()

    end = time.time()
    for idx, (images, labels) in enumerate(train_loader):
        data_time.update(time.time() - end)

        images = torch.cat([images[0].unsqueeze(1), images[1].unsqueeze(1)], dim=1)
        images = images.view(-1, 3, 32, 32).cuda(non_blocking=True)
        # print('..............Image Size...........', images.shape)
        labels = labels.cuda(non_blocking=True)
        # print('..............Label Size...........', labels.shape)
        bsz = labels.shape[0]
        # compute loss
        features = model(images)
        # print('....................Size of Features (Before)............', features.shape)
        features = features.view(bsz, 2, -1)     # feature size 128,2,128
        # For Angular variance
        if opt.dataset == 'cifar10':
            print('Dataset :  Cifar10')
        elif opt.dataset == 'cifar100':
            print('Dataset : Cifar100')
        # bsz: batch_size
        # output = torch.randn(batch_size, features)
        targets = torch.randint(0, nbclasses, (bsz,)).cuda(non_blocking=True)
        class_features = {idx: [features] for idx in range(nbclasses)}
        for class_index in range(nbclasses):
            idx = targets == class_index
            class_feat = output[idx]
        loss = criterion(class_features, labels)

What do these dimensions mean? E.g. is one of them the batch dimensions and others feature dimensions? What would a size of 4 for dim1 mean and why are you currently calling features.view(bsz, 2, -1)?

As already said, I’m not familiar with this paper, but can assist in debugging some code, if you can explain what you are trying to achieve and what is currently not working. :wink:
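For reference, one possible way to reuse the per-class mean logic with a feature tensor of shape [bsz, 2, feat_dim] is sketched below. It assumes that dim 1 holds two augmented views of the same image, so both views share the image's label; that interpretation is an assumption and is not confirmed in the thread. The random `features` and `labels` are stand-ins for the real model output and batch labels.

```python
import torch

torch.manual_seed(0)
nb_classes, bsz, n_views, feat_dim = 10, 128, 2, 128
features = torch.randn(bsz, n_views, feat_dim)   # stand-in for the model output
labels = torch.randint(0, nb_classes, (bsz,))    # stand-in for the real labels

# Treat every view as its own sample: [bsz, 2, feat_dim] -> [bsz * 2, feat_dim]
flat_features = features.reshape(-1, feat_dim)
# Repeat each label once per view so it lines up with flat_features
flat_labels = labels.repeat_interleave(n_views)

class_means = {}
for c in range(nb_classes):
    mask = flat_labels == c
    if mask.any():  # a class may be absent from a given batch
        class_means[c] = flat_features[mask].mean(dim=0)  # shape [feat_dim]
```

Flattening the view dimension first keeps the rest of the loop identical to the 2D example; the alternative of keeping the views separate would give one [2, feat_dim] mean per class.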

Dear ptrblck,
Thanks for your reply. I have managed to incorporate your code and logic into mine and have executed it successfully with a feature dimension of [128, 2, 128]. However, I still don’t understand the logic of calculating the mean values: according to my understanding, the mean should be a single value, but I get the following output, with 10 such tensors for the mean values.
Class Feature Mean…: {0: tensor([[ 1.8197e-01, 2.5143e-02, -1.0864e-04, -7.5293e-02, -6.2245e-02,
6.6454e-02, 2.2286e-02, -3.5217e-02, 5.2823e-02, 2.7451e-02,
8.6874e-02, 5.1490e-02, -6.2997e-02, -6.6158e-02, -9.9687e-02,
9.7800e-02, 5.1059e-02, 5.1787e-02, 2.3591e-02, 2.5489e-02,
2.0475e-01, 6.9972e-02, -5.2651e-02, 6.5951e-02, 4.6435e-02,
-7.2646e-02, -2.6367e-02, 5.7257e-02, -6.5422e-03, -2.4042e-02,
5.7473e-02, 6.9909e-02, -1.8475e-01, 4.5108e-02, -1.8931e-02,
1.1304e-01, -1.4951e-01, 5.7364e-02, 5.5015e-02, 3.4561e-02,
-7.1705e-02, -5.5109e-02, -6.4743e-02, -1.4122e-01, 1.0535e-01,
-4.5873e-02, 2.1184e-02, -5.6603e-02, 2.8610e-02, 1.4033e-01,
5.9965e-03, -4.0568e-03, 1.5643e-02, -1.5262e-01, -1.7967e-02,
1.3538e-01, -8.6360e-02, 1.5044e-01, 6.4374e-02, -1.8130e-01,
-4.8363e-02, -2.3257e-02, -1.8913e-02, -4.0417e-02, -6.0418e-03,
1.0047e-02, 6.2920e-02, -1.4348e-02, -4.8396e-02, 1.4185e-01,
1.3855e-01, -4.7415e-02, 1.1054e-01, -2.4817e-03, 5.2295e-02,
-2.3096e-02, 1.8722e-01, 1.0053e-01, 7.2813e-02, 2.9129e-02,
1.3816e-01, -6.6317e-02, -1.3940e-01, 2.5105e-02, -4.2261e-02,
1.8815e-02, 4.5387e-02, -3.9552e-02, -7.7882e-02, -6.2132e-02,
2.9124e-02, 3.7498e-03, -8.9862e-02, -4.3700e-02, 7.1030e-02,
6.5240e-02, 3.0205e-02, 2.6702e-03, 3.6049e-02, 9.3761e-02,
3.6270e-02, -1.1766e-01, -1.6088e-01, 4.6290e-02, -8.2123e-02,
2.4309e-03, 1.1423e-01, -5.4024e-02, 1.7386e-02, -1.0836e-02,
1.7031e-01, 3.2585e-02, 1.1521e-01, 3.0698e-02, 9.9960e-02,
-1.7035e-02, -7.7930e-02, 1.8263e-01, 1.0334e-01, 1.8118e-01,
4.8566e-02, -7.1670e-02, -1.2270e-01, -1.3332e-01, -1.5380e-02,
-1.5914e-01, -1.0167e-01, 1.1124e-01],
[ 1.8200e-01, 3.9448e-02, 3.7286e-03, -6.7564e-02, -4.8194e-02,
6.0452e-02, 2.4397e-02, -2.2919e-02, 5.0559e-02, 4.2893e-02,
8.5817e-02, 5.7405e-02, -7.3969e-02, -7.0352e-02, -9.5953e-02,
9.1239e-02, 4.4439e-02, 4.9519e-02, 2.0429e-02, 2.1237e-02,
1.9860e-01, 6.6372e-02, -3.8266e-02, 6.7373e-02, 5.0183e-02,
-7.4488e-02, -1.9873e-02, 6.0643e-02, -1.4473e-02, -2.5556e-02,
6.8193e-02, 6.7811e-02, -1.7251e-01, 5.6405e-02, -1.1552e-02,
1.1177e-01, -1.4240e-01, 5.5175e-02, 5.4221e-02, 3.0813e-02,
-6.9005e-02, -5.7479e-02, -5.5988e-02, -1.4929e-01, 1.0589e-01,
-6.4639e-02, 7.6206e-03, -7.1328e-02, 2.3568e-02, 1.5242e-01,
6.8193e-03, 1.3356e-03, 1.5267e-02, -1.5220e-01, -2.4627e-02,
1.2660e-01, -9.6360e-02, 1.3858e-01, 5.2747e-02, -1.8939e-01,
-5.4749e-02, -2.7353e-02, -1.8591e-02, -5.3250e-02, 4.5380e-03,
5.6123e-03, 6.6939e-02, -1.6745e-02, -6.5302e-02, 1.2999e-01,
1.4105e-01, -4.6737e-02, 9.6890e-02, -8.0185e-03, 5.5306e-02,
-2.4646e-02, 1.8729e-01, 1.0327e-01, 5.8818e-02, 2.0035e-02,
1.3873e-01, -7.6202e-02, -1.4655e-01, 3.3630e-02, -6.1461e-02,
2.0106e-02, 3.6875e-02, -5.0162e-02, -8.8452e-02, -6.1024e-02,
4.8275e-02, 1.8975e-02, -9.7968e-02, -3.9750e-02, 6.8793e-02,
5.6112e-02, 3.6727e-02, 4.3991e-03, 2.7574e-02, 9.6684e-02,
4.8077e-02, -1.3928e-01, -1.6517e-01, 4.6624e-02, -8.2189e-02,
-6.7808e-03, 9.8721e-02, -6.0551e-02, 1.1518e-02, -1.7801e-02,
1.7948e-01, 2.1270e-02, 1.1515e-01, 4.2319e-02, 8.7374e-02,
-3.0091e-02, -9.1005e-02, 1.8670e-01, 1.0329e-01, 1.9839e-01,
5.2998e-02, -7.2922e-02, -1.3568e-01, -1.1318e-01, -2.0052e-02,
-1.5237e-01, -9.6367e-02, 1.1343e-01]], device='cuda:0',
grad_fn=), 1: tensor([[ 0.1787, 0.0171, 0.0090, -0.0619, -0.0658, 0.0561, 0.0311, -0.0270,

In my code snippet I’ve created the mean tensor for each class in a dict.
Are you concerned about getting one tensor per class, or does the mean tensor for each class have an unexpected shape?

Thanks ptrblck,
I am confused about the shape of the mean tensor for each class. For example, the mean of 10, 20 and 30 is (10 + 20 + 30) / 3 = 20, so I am wondering why we get multiple values instead of a single one. Apologies in advance if this is a silly question.

In my previous code snippet each feature tensor had a feature dimension of 4.
I’ve collected all feature tensors belonging to the same class in a dict and appended them to a list.
Once this was done, I’ve calculated the mean of all tensors in the “batch dimension”, which creates a mean tensor of the shape [features].

I.e. if you have 10 tensors, where each tensor has a feature dimension of 4, your tensor would have a shape of [10, 4]. Calculating tensor.mean(dim=0) creates a mean tensor in the shape [4].
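A small sketch of how the `dim` argument changes the shape of the result; the tensor names are illustrative only:

```python
import torch

# Three scalars: the mean is a single value
x = torch.tensor([10.0, 20.0, 30.0])
print(x.mean())                    # tensor(20.)

# 10 samples with 4 features each
feats = torch.randn(10, 4)
print(feats.mean(dim=0).shape)     # torch.Size([4])  (one mean per feature)
print(feats.mean().shape)          # torch.Size([])   (mean over all elements)

# 10 samples, each of shape [2, 128]
feats3d = torch.randn(10, 2, 128)
print(feats3d.mean(dim=0).shape)   # torch.Size([2, 128])
```

Reducing only over dim 0 averages across samples and keeps every remaining dimension, which is why a [N, 2, 128] input yields a [2, 128] mean per class rather than one number.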

Hi ptrblck,
Could you please guide me on how to get the size of the mean tensors from your code (given below)?

class_features_mean = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    tmp = torch.stack(class_features[class_index])
    class_features_mean[class_index] = tmp.mean(dim=0)


#> {0: tensor([-0.8606, -0.1773, 0.0448, 0.6243]), 1: tensor([-1.0357, -0.2026, 1.2145, 0.2696]
It generates the following error

class_features_mean is a dict and contains the mean tensor for each class.
To print the shape of the mean tensor for class0, you could use:
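A minimal sketch of that lookup, assuming `class_features_mean` was built as in the loop above (the stand-in dict here is hypothetical and only mimics its layout):

```python
import torch

# Hypothetical stand-in for the dict built earlier: one mean tensor per class
class_features_mean = {idx: torch.randn(4) for idx in range(3)}

# Shape of the mean tensor for class 0
print(class_features_mean[0].shape)   # torch.Size([4])

# Shapes for every class
for class_index, mean in class_features_mean.items():
    print(class_index, mean.shape)
```

Calling `.shape` on the dict itself raises an `AttributeError`, since a dict has no shape; the class index has to be used as the key first.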

Thanks a lot ptrblck