Covariance and gradient support

Is there a proposal for torch.cov() and its gradient support?


The covariance of X of shape [*, n] (with n observations per variable) can be easily estimated as:
cov(X) = M M^T / (n - 1), where M = X - mean(X) and the mean is taken over the observation dimension (i.e. each variable's mean across its n observations is subtracted).
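As a tiny sanity check of the formula (the numbers are just an illustration):

import numpy as np

X = np.array([[1., 2., 3.],
              [2., 4., 6.]])                 # 2 variables, 3 observations each
M = X - X.mean(axis=1, keepdims=True)        # subtract each variable's mean
C = M @ M.T / (X.shape[1] - 1)               # M M^T / (n - 1)
print(np.allclose(C, np.cov(X)))             # True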
If we look into the implementation of np.cov, we can simplify it for this use case as follows:

import warnings
import numpy as np

def np_cov(m, rowvar=False):
    # Handles complex arrays too
    m = np.asarray(m)
    if m.ndim > 2:
        raise ValueError('m has more than 2 dimensions')
    dtype = np.result_type(m, np.float64)
    m = np.array(m, ndmin=2, dtype=dtype)
    if not rowvar and m.shape[0] != 1:
        m = m.T
    if m.shape[0] == 0:
        return np.array([], dtype=dtype).reshape(0, 0)

    # Determine the normalization
    fact = m.shape[1] - 1
    if fact <= 0:
        warnings.warn("Degrees of freedom <= 0 for slice",
                      RuntimeWarning, stacklevel=2)
        fact = 0.0

    m -= np.mean(m, axis=1, keepdims=True)
    c = np.dot(m, m.T.conj())
    c *= np.true_divide(1, fact)
    return c.squeeze()

This can be easily tested as follows:

r = False
x = np.random.rand(3, 1000)
np_c = np.cov(x, rowvar=r)
our_c = np_cov(x, rowvar=r)
print(np.allclose(np_c, our_c))

To port it to PyTorch, I did the following:

import torch

def cov(m, rowvar=False):
    '''Estimate a covariance matrix given data.

    Covariance indicates the level to which two variables vary together.
    If we examine N-dimensional samples, `X = [x_1, x_2, ... x_N]^T`,
    then the covariance matrix element `C_{ij}` is the covariance of
    `x_i` and `x_j`. The element `C_{ii}` is the variance of `x_i`.

    Args:
        m: A 1-D or 2-D array containing multiple variables and observations.
            Each row of `m` represents a variable, and each column a single
            observation of all those variables.
        rowvar: If `rowvar` is True, then each row represents a
            variable, with observations in the columns. Otherwise, the
            relationship is transposed: each column represents a variable,
            while the rows contain observations.

    Returns:
        The covariance matrix of the variables.
    '''
    if m.dim() > 2:
        raise ValueError('m has more than 2 dimensions')
    if m.dim() < 2:
        m = m.view(1, -1)
    if not rowvar and m.size(0) != 1:
        m = m.t()
    # m = m.type(torch.double)  # uncomment this line if desired
    fact = 1.0 / (m.size(1) - 1)
    # subtract the per-variable mean out of place, so the input tensor is left
    # untouched and autograd works even for leaf tensors that require grad
    m = m - torch.mean(m, dim=1, keepdim=True)
    mt = m.t()  # if complex: mt = m.t().conj()
    return fact * m.matmul(mt).squeeze()

It is perhaps worth noting that PyTorch currently doesn’t support complex tensors, so the conjugate of a real matrix is the matrix itself. Another thing is that I like the PyTorch convention where a function that consumes tensors and produces new ones keeps the same dtype. If you don’t want this behaviour, just uncomment the type-casting line; otherwise, cast your tensor to double before passing it to the function if you need the extra precision. Finally, I prefer rowvar=False by default, which is not the default behaviour of np.cov.
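For example, with the dtype-preserving behaviour kept (a small illustration using the cov defined above):

x32 = torch.rand(3, 1000)                      # float32 input, 3 variables x 1000 observations
print(cov(x32, rowvar=True).dtype)             # torch.float32 -- same dtype as the input
print(cov(x32.double(), rowvar=True).dtype)    # torch.float64 -- cast beforehand for extra precision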

Here is how to test this implementation:

r = False
x = np.random.rand(3, 1000)
xt = torch.from_numpy(x).type(torch.double)
np_c = np.cov(x, rowvar=r)
our_c = cov(xt, rowvar=r).numpy()
print(np.allclose(np_c, our_c))

Because the function is built entirely from differentiable tensor operations (no special functions), autograd gives us its gradient for free.
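For example, here is a quick sketch of a gradient check using the cov defined above (gradcheck wants double-precision inputs):

xt = torch.rand(3, 50, dtype=torch.double, requires_grad=True)
c = cov(xt, rowvar=True)
c.sum().backward()
print(xt.grad.shape)  # torch.Size([3, 50])

# numerically verify the analytic gradients (prints True if they match)
print(torch.autograd.gradcheck(lambda t: cov(t, rowvar=True), (xt,)))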


I came across this question on StackOverflow, which asks about computing the covariance matrix for complex tensors in torch.


Yeah, the problem with that is that you can’t always work with the resulting matrices; e.g. I would have had to apply a Fourier transform to the result, which of course doesn’t work since PyTorch doesn’t have complex support yet.

This is the same code with an inplace flag.

import torch

def cov(m, rowvar=True, inplace=False):
    '''Estimate a covariance matrix given data.

    Covariance indicates the level to which two variables vary together.
    If we examine N-dimensional samples, `X = [x_1, x_2, ... x_N]^T`,
    then the covariance matrix element `C_{ij}` is the covariance of
    `x_i` and `x_j`. The element `C_{ii}` is the variance of `x_i`.

    Args:
        m: A 1-D or 2-D array containing multiple variables and observations.
            Each row of `m` represents a variable, and each column a single
            observation of all those variables.
        rowvar: If `rowvar` is True, then each row represents a
            variable, with observations in the columns. Otherwise, the
            relationship is transposed: each column represents a variable,
            while the rows contain observations.
        inplace: If True, the mean is subtracted in place, which modifies `m`
            and should not be used on leaf tensors that require grad.
            Otherwise, the centering is done on a copy.

    Returns:
        The covariance matrix of the variables.
    '''
    if m.dim() > 2:
        raise ValueError('m has more than 2 dimensions')
    if m.dim() < 2:
        m = m.view(1, -1)
    if not rowvar and m.size(0) != 1:
        m = m.t()
    # m = m.type(torch.double)  # uncomment this line if desired
    fact = 1.0 / (m.size(1) - 1)
    if inplace:
        m -= torch.mean(m, dim=1, keepdim=True)
    else:
        m = m - torch.mean(m, dim=1, keepdim=True)
    mt = m.t()  # if complex: mt = m.t().conj()
    return fact * m.matmul(mt).squeeze()
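A quick illustration of the difference (just a sketch; inplace=True centers the input tensor itself, so it should not be used on a leaf tensor that requires grad):

x = torch.rand(3, 100)
x_backup = x.clone()

c1 = cov(x, inplace=False)
print(torch.equal(x, x_backup))   # True  -- the input is left untouched

c2 = cov(x, inplace=True)
print(torch.equal(x, x_backup))   # False -- the input has been mean-centered in place
print(torch.allclose(c1, c2))     # True  -- the covariance itself is the same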

cor() and cov() for a 2D data matrix are easy to write.

However, TBH, it is not that straightforward for, e.g., a one-hot-encoded tensor built from that 2D data matrix.

So, what about nicely written cor() and cov() for a general tensor?

This topic has gone stale. Please check the latest updates on this issue.

As a sneak peek, check this implementation, which also handles batched inputs (the last two dimensions are treated as the data matrix):

def cov(tensor, rowvar=True, bias=False):
    """Estimate a covariance matrix (np.cov)
    https://gist.github.com/ModarTensai/5ab449acba9df1a26c12060240773110
    """
    tensor = tensor if rowvar else tensor.transpose(-1, -2)
    tensor = tensor - tensor.mean(dim=-1, keepdim=True)
    factor = 1 / (tensor.shape[-1] - int(not bool(bias)))
    return factor * tensor @ tensor.transpose(-1, -2).conj()
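For example, it works on a batch of data matrices, and a correlation matrix can be derived from it (the corrcoef helper below is just a sketch of mine, not part of the linked gist):

batch = torch.rand(5, 4, 100, dtype=torch.double)   # 5 matrices, 4 variables, 100 observations
c = cov(batch)                                      # shape (5, 4, 4): one covariance per matrix
print(np.allclose(c[0].numpy(), np.cov(batch[0].numpy())))  # True

def corrcoef(tensor, rowvar=True):
    """Pearson correlation matrix built on cov() (sketch, analogous to np.corrcoef)."""
    c = cov(tensor, rowvar=rowvar)
    std = c.diagonal(dim1=-2, dim2=-1).sqrt()
    return c / (std.unsqueeze(-1) * std.unsqueeze(-2))

print(corrcoef(batch).shape)                        # torch.Size([5, 4, 4])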