Calculating covariance matrix of a set of feature vectors

Mona_Jalal · April 23, 2019, 4:17am

I previously used https://github.com/christiansafka/img2vec to create ResNet-50 feature vectors for each of my categories so that I could compute some statistics. Now the same code throws an error. I am confused about how to fix it, since it was working before and I have changed nothing (the PyTorch version is the same as before, 0.4).

import numpy as np
from numpy import linalg as LA

infile1 = np.loadtxt('T1_resnet50_feature_vectors.txt')
infile1_reshaped = infile1.reshape(2048, 21)
cov1 = np.cov(infile1_reshaped, rowvar=False)

infile1.shape is (38, 2048)

The reshape raises an error, but I need the reshape to calculate the covariance.

The code on GitHub uses torch.

Traceback (most recent call last):
  File "feature_vectors_statistics.py", line XYZ, in <module>
    infile1_reshaped = infile1.reshape(2048, 21)
ValueError: cannot reshape array of size 77824 into shape (2048,21)

Please suggest how to calculate the covariance matrix from a file that contains one feature vector per image, one per line.

Tony-Y · April 23, 2019, 7:07am

This error does not come from PyTorch. It is raised by the reshape call in NumPy.

I have a couple of questions. Why did you use 21 instead of 38 when reshaping infile1? Why is the reshaping necessary for calculating the covariance?
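
For reference, the size mismatch behind the ValueError can be checked directly. This is a minimal sketch that uses a zero-filled array as a stand-in for the loaded file (the real feature values are not reproduced here); the variable name `features` is only illustrative.

```python
import numpy as np

# Stand-in for the loaded file: 38 images x 2048 ResNet-50 features.
features = np.zeros((38, 2048))

# Total number of elements actually present in the array.
print(features.size)        # 38 * 2048 = 77824

# Number of elements that reshape(2048, 21) would need.
print(2048 * 21)            # 43008

# reshape can only rearrange elements, never drop or add them,
# so the two counts must match exactly.
try:
    features.reshape(2048, 21)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(err)
```

Because 77824 is not equal to 43008, NumPy refuses the reshape, which matches the message in the traceback.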

Mona_Jalal · April 24, 2019, 12:09am

Thanks a lot, Tony. I made a mistake on my end regarding 21. Your questions helped me figure it out, and my problem is resolved.

21 was the old size of my data. My data set has since grown, and I forgot to change the value to 38. It was a silly mistake.
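
One way to avoid this kind of breakage when the data grows is to not hard-code the row count at all. The following is a rough sketch, not the original script: it assumes the file loads as an array of shape (number of images, 2048), and it uses random numbers as a stand-in for the real feature vectors. The names `features`, `cov_features`, and `cov_images` are illustrative.

```python
import numpy as np

# Hypothetical stand-in for
#   np.loadtxt('T1_resnet50_feature_vectors.txt')
# One row per image, one column per feature.
rng = np.random.default_rng(0)
features = rng.normal(size=(38, 2048))

# Read the sizes from the array instead of hard-coding 21 or 38.
n_images, n_features = features.shape

# Covariance between features: each column is a variable,
# each row is an observation.
cov_features = np.cov(features, rowvar=False)

# Covariance between images: each row is a variable
# (rowvar=True is the default).
cov_images = np.cov(features)

# The same image-by-image result via an explicit transpose.
cov_images_t = np.cov(features.T, rowvar=False)

print(cov_features.shape)  # (n_features, n_features)
print(cov_images.shape)    # (n_images, n_images)
```

Which of the two matrices is wanted depends on the statistics being computed. Note also that `reshape(2048, n)` reorders the elements rather than swapping the axes, so if the intent is to turn rows into columns, `features.T` is probably the safer choice.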