AUTOGRAD for the argmax function

Hello,
I have a question about the argmax function. When I call torch.argmax(output, 1), I get a result with grad_fn = NotImplemented, and when I tried:

torch.argmax(output, 1).float().requires_grad_(True)

it shows me:

grad_fn = CopyBackwards
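
For reference, a minimal sketch reproducing these calls (output is just a placeholder tensor here, and the exact grad_fn text can vary between PyTorch versions):

import torch

# `output` is a placeholder; the point is how gradients (don't) flow.
output = torch.randn(4, 3, requires_grad=True)

idx = torch.argmax(output, 1)            # LongTensor of indices
print(idx.requires_grad, idx.grad_fn)    # False, None -- indices are not tracked

idx_f = torch.argmax(output, 1).float().requires_grad_(True)
idx_f.sum().backward()                   # backward runs, but only back to idx_f
print(output.grad)                       # None -- no gradient reaches `output`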

So can you explain to me what exactly that means?
Thank you in advance.

argmax is not mathematically differentiable without relaxations
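
One common relaxation is a "soft argmax": take a softmax over the scores and use the resulting weights over the index positions, which gives a differentiable, real-valued surrogate for the index. A minimal sketch (the soft_argmax name and the temperature parameter are just illustrative), assuming a 2-D tensor of scores:

import torch

def soft_argmax(scores, temperature=1.0):
    # Differentiable surrogate for argmax over dim=1 of a 2-D tensor:
    # softmax weights times the index positions give an "expected index".
    # A lower temperature pushes the result closer to the hard argmax.
    weights = torch.softmax(scores / temperature, dim=1)
    positions = torch.arange(scores.size(1), dtype=scores.dtype, device=scores.device)
    return (weights * positions).sum(dim=1)

scores = torch.randn(4, 3, requires_grad=True)
soft_idx = soft_argmax(scores, temperature=0.1)  # float tensor with a grad_fn
soft_idx.sum().backward()                        # gradients flow back to `scores`
print(scores.grad is not None)                   # True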

Thanks for your reply.
I didn't understand your explanation; can you explain more, please?

Hello Massivaa!

Another way to say it is that argmax() is not usefully differentiable.

Consider torch.argmax(torch.FloatTensor([x, 1.0])).
argmax() will be 1 for x < 1.0 and 0 for x > 1.0. In both cases
its derivative (gradient) with respect to x will be zero, and it won’t
be mathematically differentiable right at x = 1.0.

Having zero gradient almost everywhere isn't useful for gradient-descent
optimization, so PyTorch doesn't bother to implement autograd (grad_fn)
for argmax().
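
To make the zero-gradient point concrete, a small numerical check of that example:

import torch

# The output is piecewise constant in x: it is 1 on one side of x = 1.0 and
# 0 on the other, so its derivative is 0 everywhere it is defined.
for x in (0.5, 0.999, 1.001, 2.0):
    print(x, torch.argmax(torch.FloatTensor([x, 1.0])).item())
# prints 1 for x < 1.0 and 0 for x > 1.0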

Best.

K. Frank


Thank you for your explanation, K. Frank :blush: