I have a question about the argmax function. When I run:
torch.argmax(output, 1)
I get a result with:
grad_fn = NotImplemented
and when I try:
torch.argmax(output, 1).float().requires_grad_(True)
it shows me:
grad_fn = CopyBackwards
Can you explain to me what exactly that means?
Thank you in advance.
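For reference, something like this is what I'm doing (the `output` tensor here is just a made-up (batch, classes) example standing in for my model's output, and the exact grad_fn string printed may differ between PyTorch versions):

```python
import torch

# Made-up stand-in for the real model output: a (batch, classes) float tensor.
output = torch.randn(4, 10, requires_grad=True)

idx = torch.argmax(output, 1)
print(idx.dtype, idx.grad_fn)       # argmax returns integer indices; inspect its grad_fn

idx_f = torch.argmax(output, 1).float().requires_grad_(True)
print(idx_f.dtype, idx_f.grad_fn)   # same indices after .float().requires_grad_(True)
```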
argmax() is not mathematically differentiable without relaxations.
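Just as an illustration, here is a minimal sketch of one such relaxation (a "soft" argmax that replaces the hard index with a softmax-weighted average of the index positions); the function name, temperature value, and tensor shapes below are only examples, not anything built into pytorch:

```python
import torch

def soft_argmax(logits, temperature=0.1):
    # Differentiable stand-in for argmax over the last dimension:
    # a softmax-weighted average of the index positions.
    weights = torch.softmax(logits / temperature, dim=-1)
    positions = torch.arange(logits.size(-1), dtype=logits.dtype, device=logits.device)
    return (weights * positions).sum(dim=-1)

logits = torch.randn(4, 10, requires_grad=True)
soft_idx = soft_argmax(logits)
soft_idx.sum().backward()
print(logits.grad is not None)   # True: gradients flow through the relaxation
```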
Thanks for your reply. I didn't understand your explanation; could you explain more, please?
Another way to say it is that argmax() is not usefully differentiable. Consider torch.argmax(torch.FloatTensor([x, 1.0])). argmax() will be 1 for x < 1.0 and 0 for x > 1.0. In both cases its derivative (gradient) with respect to x will be zero, and it won't be mathematically differentiable right at x = 1.0.

Having zero gradient almost everywhere isn't useful for gradient descent optimization, so pytorch doesn't bother to implement a backward for argmax().
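As a small illustration of the example above (the particular x values are just made up):

```python
import torch

# argmax of [x, 1.0] jumps from index 1 to index 0 as x crosses 1.0,
# and is constant (hence zero gradient) everywhere else.
for x in (0.5, 0.999, 1.001, 2.0):
    t = torch.tensor([x, 1.0])
    print(x, torch.argmax(t).item())   # 1 while x < 1.0, then 0 once x > 1.0

# The result of argmax() is an integer index tensor with no gradient history,
# so nothing can be backpropagated through it.
x = torch.tensor([0.5, 1.0], requires_grad=True)
idx = torch.argmax(x)
print(idx.requires_grad)               # False
```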
Thank you for your explanation, K. Frank.