Angular Features

Hi, just wondering if anyone has any idea about finding the angles between the centers of classes and their corresponding features. I want to implement the model from the CVPR 2020 paper "Deep Representation Learning on Long-Tailed Data: A Learnable Embedding Augmentation Perspective".

Any help would be much appreciated.

I cannot find any implementation for this particular paper, but this repository seems to provide some implementations from this research area (if I'm not mistaken), so maybe you could reuse some of the code.

Hi ptrblck,
Could you please implement the highlighted equations in PyTorch by extracting the features from the CIFAR dataset using the ResNet-32 architecture? I have been trying for the last couple of days but have been unable to do it.
I shall be very thankful to you for this favor.

Cheers,
Angelina

Could you post your current code snippets and explain where you are currently stuck, please?

Dear ptrblck,
Thanks for the reply. In the following code snippet, the features are extracted at line no. 386. I want to find the angles between these features and their corresponding class centers. I am stuck on how to find the classes, their centers, and the angles between the class centers and the corresponding features. Although it is described in the paper (mentioned before), I cannot implement it.

Based on the features output of the model you could start by calculating the class centers.
I don't see a clear description of how these centers are calculated, so I assume they are just the mean of the features corresponding to the current class?
If so, you could use the target to index the features and thus split them into different class features.
Once this is done, you could apply the mean to generate the class centers.

Hi ptrblck,
Thanks a lot for your reply; however, it all went over my head. Could you please help with that?

Assuming my description is right, you could use this simple example, which uses the targets for the current batch to index the output and collect the features per class:

import torch

nb_classes = 3
batch_size = 10
features = 4
output = torch.randn(batch_size, features)
targets = torch.randint(0, nb_classes, (batch_size,))

print(targets)
> tensor([0, 0, 0, 2, 0, 1, 1, 2, 2, 2])

# collect the features belonging to each class in a dict of lists
class_features = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    idx = targets == class_index      # boolean mask selecting samples of this class
    class_feat = output[idx]
    class_features[class_index].extend(class_feat)

print(class_features)
> {0: [tensor([-0.8338,  0.3141, -0.2840,  0.6104]), tensor([-1.5458, -0.3546, -0.3190,  0.6153]), tensor([-1.3919, -0.4112,  0.4425,  0.8475]), tensor([ 0.3294, -0.2577,  0.3397,  0.4239])], 1: [tensor([-1.4398,  0.2516,  1.6932, -0.0364]), tensor([-0.6315, -0.6568,  0.7358,  0.5755])], 2: [tensor([ 1.3726, -0.7952,  2.1696,  0.6634]), tensor([ 0.4024, -2.2702,  2.2658,  2.6325]), tensor([ 0.2029, -1.4608, -0.2938,  1.0877]), tensor([-1.2883, -0.5849,  0.2535, -0.0638])]}

In the next step you could then use another loop to calculate the mean of these features:

# stack the collected features per class and average over the batch dimension
class_features_mean = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    tmp = torch.stack(class_features[class_index])       # [num_samples_in_class, features]
    class_features_mean[class_index] = tmp.mean(dim=0)   # [features]

print(class_features_mean)
> {0: tensor([-0.8606, -0.1773,  0.0448,  0.6243]), 1: tensor([-1.0357, -0.2026,  1.2145,  0.2696]), 2: tensor([ 0.1724, -1.2778,  1.0988,  1.0799])}

Note that you could of course speed up this code, but I would recommend sticking to loops for now until you are sure the method works as intended.
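
Building on this, here is a minimal editorial sketch (not part of the original reply) of how the angle between each feature and its class center could be computed from the class_features_mean dict above, using cosine similarity; the clamp is only there to keep acos from producing NaNs:

import torch
import torch.nn.functional as F

# reusing `output`, `targets`, `nb_classes` and `class_features_mean` from the snippets above
centers = torch.stack([class_features_mean[c] for c in range(nb_classes)])  # [nb_classes, features]

# pick the center belonging to each sample and compute the angle via cosine similarity
sample_centers = centers[targets]                                # [batch_size, features]
cos_sim = F.cosine_similarity(output, sample_centers, dim=1)     # [batch_size]
angles = torch.acos(cos_sim.clamp(-1 + 1e-7, 1 - 1e-7))          # angles in radians

print(angles.shape)
> torch.Size([10])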

Thank you very much ptrblck, I will try to incorporate it into my work.

Heaps, heaps of thanks and stay blessed.

Cheers.

Hi ptrblck,
I tried to incorporate that logic into my work; however, in my case the dimension of the features is [batch_size (128), 2, 128] instead of 4 (as in your example). I think this is what is causing the problem. Could you please have a look at the code and corresponding output shown in the following screenshots?

[screenshot: code snippet]

[screenshot: output]

What is the shape of targets?
PS: you can post code snippets by wrapping them in three backticks ```, which makes debugging easier. :wink:

Hi ptrblck,
By targets, do you mean the total number of classes or the train targets? I am using the CIFAR-10 dataset, so the targets size would be 10.

In that case, could you compare your current tensors, including their shapes, to my code snippet, as it's working? If you get stuck, could you post an executable code snippet to reproduce this issue?

Thanks ptrblck for your reply. I am really stuck after spending a couple of days on this problem.
What I want to do:
I want to find the average angular variance of the head classes (say, classes with more than 500 samples are head classes, the others are tail classes) and then transfer it to the tail classes. I think the solution you proposed in your last post is fine. So initially I want to extract the features of all classes, associate those features with their corresponding classes, and then find the angles between the features and the corresponding class centers. Your solution makes sense to me; however, in my case the feature size is [128, 2, 128] instead of 4. Could you please help with this? I shall be very thankful to you.
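
As an aside (an editorial sketch, not from the original post), the average angular variance over the head classes could be computed along these lines; the per-class angle tensors and class sizes below are made up, while the 500-sample threshold follows the description above:

import torch

# hypothetical per-class angles (radians) between each sample and its class center
angles_per_class = {c: torch.rand(n) for c, n in enumerate([600, 550, 30, 20])}
samples_per_class = {c: len(a) for c, a in angles_per_class.items()}

head_threshold = 500  # classes with more than 500 samples are treated as head classes
head_classes = [c for c, n in samples_per_class.items() if n > head_threshold]

# variance of the angles within each head class, averaged over all head classes
head_variance = torch.stack([angles_per_class[c].var(unbiased=False) for c in head_classes]).mean()
print(head_variance)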

The corresponding code -snippet is given below.

def train(train_loader, model, criterion, optimizer, epoch, opt, train_targets):
    """one epoch training"""

    model.train()

    batch_time = AverageMeter()
    data_time = AverageMeter()
    losses = AverageMeter()

    end = time.time()
    for idx, (images, labels) in enumerate(train_loader):
        data_time.update(time.time() - end)

        # concatenate the two image crops and flatten them into the batch dimension
        images = torch.cat([images[0].unsqueeze(1), images[1].unsqueeze(1)],
                           dim=1)
        images = images.view(-1, 3, 32, 32).cuda(non_blocking=True)
        # print('..............Image Size...........', images.shape)
        labels = labels.cuda(non_blocking=True)
        # print('..............Label Size...........', labels.shape)
        bsz = labels.shape[0]

        # compute loss
        features = model(images)
        # print('....................Size of Features (Before)............', features.shape)

        features = features.view(bsz, 2, -1)     # feature size [128, 2, 128]

        # for angular variance
        if opt.dataset == 'cifar10':
            nbclasses = 10
            print('Dataset: Cifar10')
        elif opt.dataset == 'cifar100':
            nbclasses = 100
            print('Dataset: Cifar100')

        # bsz: batch_size
        # output = torch.randn(batch_size, features)
        output = torch.randn(features.shape)

        targets = torch.randint(0, nbclasses, (bsz,)).cuda(non_blocking=True)

        class_features = {idx: [features] for idx in range(nbclasses)}

        for class_index in range(nbclasses):
            idx = targets == class_index
            class_feat = output[idx]
            class_features[class_index].extend(class_feat)

        loss = criterion(class_features, labels)
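
(Editorial note, not part of the original post.) One possible way to make the per-class indexing from the earlier snippets work with [bsz, 2, 128] features is to flatten the two augmented views into the batch dimension and repeat the labels accordingly; a small sketch under that assumption:

import torch

bsz, n_views, feat_dim = 128, 2, 128
features = torch.randn(bsz, n_views, feat_dim)   # stand-in for the reshaped model output above
labels = torch.randint(0, 10, (bsz,))

# flatten the two augmented views into the batch dimension: [bsz, 2, 128] -> [2 * bsz, 128]
flat_features = features.view(bsz * n_views, feat_dim)
# each sample contributes two views, so repeat its label for both of them
flat_labels = labels.repeat_interleave(n_views)

print(flat_features.shape, flat_labels.shape)
> torch.Size([256, 128]) torch.Size([256])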

What do these dimensions mean? E.g. is one of them the batch dimension and are the others feature dimensions? What would a size of 4 for dim1 mean, and why are you currently calling features.view(bsz, 2, -1)?

As already said, I'm not familiar with this paper, but I can assist in debugging some code if you can explain what you are trying to achieve and what is currently not working. :wink:

Dear ptrblck,
Thanks for your reply. I have managed to incorporate your code and logic into mine and executed it successfully with a feature dimension of [128, 2, 128]. However, I still can't understand the logic of calculating the mean values: according to my understanding the mean should be a single value, but it displays the following output (one such tensor for each of the 10 classes):
Class Feature Mean…: {0: tensor([[ 1.8197e-01,  2.5143e-02, -1.0864e-04,  ..., -1.5914e-01, -1.0167e-01,  1.1124e-01],
        [ 1.8200e-01,  3.9448e-02,  3.7286e-03,  ..., -1.5237e-01, -9.6367e-02,  1.1343e-01]], device='cuda:0', grad_fn=<...>), 1: tensor([[ 0.1787,  0.0171,  0.0090, -0.0619,  ...

In my code snippet I've created the mean tensor for each class in a dict.
Are you concerned about having a tensor for each class, or does the mean tensor for each class have an unexpected shape?

Thanks ptrblck,
I am confused about the mean tensor shape for each class. For example, the mean of 10, 20 and 30 is (10 + 20 + 30) / 3 = 20, so I am just wondering why we are getting multiple values instead of a single one. Apologies in advance if you find it a silly question.

In my previous code snippet each feature tensor had a feature dimension of 4.
I've collected all feature tensors belonging to the same class in a dict and appended them to a list.
Once this was done, I've calculated the mean of all tensors in the "batch dimension", which creates a mean tensor of the shape [features].

I.e. if you have 10 tensors, where each tensor has a feature dimension of 4, the stacked tensor would have a shape of [10, 4]. Calculating tensor.mean(dim=0) then creates a mean tensor of shape [4].
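
A tiny made-up example of the shapes involved (not from the original reply):

import torch

feats = [torch.randn(4) for _ in range(10)]  # 10 feature tensors, each of size 4
stacked = torch.stack(feats)                 # shape [10, 4]
mean = stacked.mean(dim=0)                   # shape [4]: one mean per feature dimension
print(stacked.shape, mean.shape)
> torch.Size([10, 4]) torch.Size([4])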

Hi ptrblck,
Could you please explain how to get the size of the mean tensors from your code (given below)?

class_features_mean = {idx: [] for idx in range(nb_classes)}
for class_index in range(nb_classes):
    tmp = torch.stack(class_features[class_index])
    class_features_mean[class_index] = tmp.mean(dim=0)

print(class_features_mean)
# > {0: tensor([-0.8606, -0.1773, 0.0448, 0.6243]), 1: tensor([-1.0357, -0.2026, 1.2145, 0.2696]), ...

print('class_features_mean...', class_features_mean.shape)
It generates the following error:

[screenshot of the error message]