The derivative for 'eig' is not implemented

Hello,

Here is my source:

import torch

# ReEig Layer
def cal_rect_cov(features):
    # features = features.detach().cpu()
    result = []
    # features = torch.tensor(features, requires_grad=True)
    for i in range(features.shape[0]):
        # torch.eig returns eigenvalues as (real, imag) pairs
        s_f, v_f = torch.eig(features[i], eigenvectors=True)
        s_f_convert = []
        for j in range(len(s_f)):
            s_f_convert.append(s_f[j][0])  # keep only the real parts
        s_f_convert = torch.stack(s_f_convert)
        s_f2 = torch.clamp(s_f_convert, 0.0001, 10000)
        s_f2 = torch.diag(s_f2)
        features_t = torch.matmul(torch.matmul(v_f, s_f2), v_f.t())
        result.append(features_t)
    result = torch.stack(result)
    # result = result.cuda()
    return result

The loss calculation succeeds, but when I call

loss.backward()

on this code, I get an error like this:

the derivative for 'eig' is not implemented

I don't know why this error occurs.

Please help me.

Thank you.

Hi James!

Gradients (“backward pass”) are not implemented for
torch.eig(). If your features matrix is symmetric
(or can naturally be made symmetric), you can use
torch.symeig(), otherwise you will have to rethink
your loss function.

From the documentation for torch.eig():

Since eigenvalues and eigenvectors might be
complex, backward pass is supported only for
torch.symeig()

Also note, from torch.symeig():

Extra care needs to be taken when backward through
outputs. Such operation is really only stable when
all eigenvalues are distinct. Otherwise, NaN can
appear as the gradients are not properly defined.
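For example, here is a minimal sketch of your rectification step written with torch.symeig() instead (this assumes each features[i] is symmetric, e.g. a covariance matrix, which is what a ReEig layer typically operates on; the function name is just for illustration):

import torch

def cal_rect_cov_symeig(features):
    # symeig returns real eigenvalues (ascending) and orthonormal eigenvectors,
    # and it supports backward (newer pytorch offers torch.linalg.eigh for this)
    result = []
    for i in range(features.shape[0]):
        s_f, v_f = torch.symeig(features[i], eigenvectors=True)
        s_f2 = torch.clamp(s_f, 0.0001, 10000)         # rectify the eigenvalues
        features_t = v_f @ torch.diag(s_f2) @ v_f.t()  # rebuild the matrix
        result.append(features_t)
    return torch.stack(result)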

Let me just say that there is nothing fundamentally improper
about using eigenvalues (or eigenvectors) as ingredients in
your loss function, but eigenvalues can be a little tricky, both
in terms of how they depend mathematically on your inputs,
and sometimes the numerical stability of the algorithms used
to calculate them. So you should think through carefully how
and why you’re using them.
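As a concrete illustration of the "distinct eigenvalues" caveat above (a hypothetical example, not from your code): if you push a matrix with repeated eigenvalues through symeig and backpropagate through the eigenvectors, the gradient is not well defined and you will typically see inf or nan:

import torch

a = torch.eye(3, requires_grad=True)      # all eigenvalues equal (1, 1, 1)
e, v = torch.symeig(a, eigenvectors=True)
loss = (v * torch.randn(3, 3)).sum()      # a scalar that depends on the eigenvectors
loss.backward()
print(a.grad)                             # typically contains inf / nan here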

Cheers!

K. Frank


Thank you for your answer.

I solved that error, but now I have a gradient problem like the one you warned about: when I check the gradients of the eigenvalues and eigenvectors (s_f.grad, v_f.grad), they only print None.

So I need to look more closely at backward through symeig.

I am following the paper "Covariance Pooling for Facial Expression Recognition". The paper provides TensorFlow code, and I converted everything to PyTorch, but the one remaining problem is with symeig.

Do you have any other idea why symeig does not get a gradient?

Thank you for your answer again.

-James

Hi James!

First a comment:

It sounds like you are trying to translate the “Covariance Pooling”
algorithm from tensorflow to pytorch. This might well involve
some amount of work, but I would expect it to be doable.

And a question:

It sounds like you are saying that symeig is giving you “none” for
the value of its gradient. Is this the issue you are asking about?
(I should note that I haven’t ever used symeig.)

Could you try to isolate this problem and post a small, complete,
runnable pytorch script where you evaluate symeig on some
inputs, call backward(), and then print out the gradients you
get?

If we start with something simple like that, we can try to analyze
your gradient issue.
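Something along these lines would do (a hypothetical starting point, not taken from your code; the matrix and the scalar are just placeholders, so adjust them to match your actual features and loss):

import torch

a = torch.randn(5, 5)
a = a @ a.t()                          # make a symmetric input, so symeig applies
a.requires_grad_(True)

e, v = torch.symeig(a, eigenvectors=True)
rect = v @ torch.diag(torch.clamp(e, 0.0001, 10000)) @ v.t()
loss = rect.sum()                      # some scalar built from the outputs
loss.backward()

print(a.grad)                          # should be a 5x5 tensor, not None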

Best.

K. Frank

Thank you for your reply.

To clarify what I meant: during the training stage I print the gradients of the eigenvalues and the eigenvectors (s_f.grad, v_f.grad), and they both print None.

Hello James!

Well, if you’re getting “none” gradients during training, there’s
almost certainly a bug somewhere, more likely in your code,
but possibly, though less likely, in pytorch.

If you don’t find the bug fairly quickly just looking through your
code, then more orderly debugging is called for. A common
approach is to break things up into pieces to try and isolate the
bug.

As I suggested in my previous post, you might try just seeing
if you can get a gradient out of symeig. My suggestion would
be that you post a short, complete, runnable pytorch script
that pumps a tensor (with requires_grad set) through
symeig and somehow calculates a scalar (single number)
from that. Then call backward() on that scalar and see if
your input tensor has good gradients or just “none”.
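One aside that might explain part of what you are seeing (this is a guess, since your training code isn't posted): by default pytorch only populates .grad for leaf tensors, so the outputs of symeig, which are intermediate results, will print None for .grad unless you call .retain_grad() on them, even when the gradient is flowing through them to your input just fine. For example:

import torch

a = torch.randn(4, 4)
a = a @ a.t()                        # symmetric input
a.requires_grad_(True)

e, v = torch.symeig(a, eigenvectors=True)
e.retain_grad()                      # without this, e.grad stays None (non-leaf tensor)

loss = torch.clamp(e, 0.0001, 10000).sum()
loss.backward()

print(a.grad)                        # populated: the gradient reached the leaf input
print(e.grad)                        # populated only because of retain_grad()
print(v.grad)                        # None: v is non-leaf and retain_grad() was not called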

If you post the script and your results, forum participants can
not only look at it, but run it for themselves, and maybe they
will see something or have ideas about what is going on.

Good luck.

K. Frank