All–

Say I have a grayscale (non-binary) image with some object, and I’d like to determine its major axis. I could binarize and perform PCA on the foreground components only, but I’m wondering if I can instead perform a weighted PCA based on the pixel values, similar to what is described here [1]. Here’s the implementation I’ve come up with based on that post:

```
import pylab as plt
import torch as t
import numpy as np
def weighted_pca(input_mat, weights):
weights_sum = weights.sum()
weights_doubled = weights.repeat((2, 1))
mean = t.sum(input_mat*weights_doubled, axis=1) / weights_sum
matrix_centered = input_mat - mean.unsqueeze(1).float()
print(weights.shape, matrix_centered.shape)
# This line attempts to allocate a massive amount of memory (256 GB when s=512) and fails.
inner = t.matmul(t.diag(weights), matrix_centered.transpose(1, 0))
# Using sparse matrices also fails.
# inner = t.sparse.mm(t.diag(weights).to_sparse_csr(), matrix_centered.transpose(1, 0))
weighted_cov = t.matmul(matrix_centered, inner) / weights_sum
u,e,v = t.linalg.svd(weighted_cov)
return u, e, v, mean
if __name__ == "__main__":
s = 512
# Dummy binary data -- real data would vary between 0 and 1.
img = t.diag(t.ones((s)))
points = np.stack(np.meshgrid(*(np.arange(s), np.arange(s)))).reshape(2, -1)
u, e, v, mean = weighted_pca(t.Tensor(points).cuda(), img.flatten().cuda())
plt.imshow(img, origin="lower")
plt.quiver(*mean, u[0,0], u[0,1])
plt.show()
```

This works great for a small matrix size (say 256 x 256) but requires allocation of a very large amount of memory during calculation of the covariance matrix for larger sizes (say 512 x 512). I tried converting the diagonal matrix to a sparse matrix, but this didn’t help. I’m wondering if anyone can see a memory efficient way of performing this calculation, or maybe if there’s another way I could formulate this problem to take advantage of the built in `pca_lowrank`

implementation.

Happy to provide more details if necessary – thanks so much in advance to everyone who has built this incredible software and maintains this incredibly helpful forum.

Best,

–Davis

[1] pca - Weighted principal components analysis - Cross Validated