I have a question about the argmax function. When I run:
torch.argmax(output, 1)
I get a result with:
grad_fn = NotImplemented
and when I try:
torch.argmax(output, 1).float().requires_grad_(True)
it shows me:
grad_fn = CopyBackwards
Can you explain to me what exactly that means?
Thank you in advance.
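For reference, something like this is what I'm doing (the `output` tensor here is just a made-up (batch, classes) example standing in for my model's output, and the exact grad_fn string printed may differ between PyTorch versions):

```python
import torch

# Made-up stand-in for the real model output: a (batch, classes) float tensor.
output = torch.randn(4, 10, requires_grad=True)

idx = torch.argmax(output, 1)
print(idx.dtype, idx.grad_fn)       # argmax returns integer indices; inspect its grad_fn

idx_f = torch.argmax(output, 1).float().requires_grad_(True)
print(idx_f.dtype, idx_f.grad_fn)   # same indices after .float().requires_grad_(True)
```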
argmax() is not mathematically differentiable without relaxations.
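Just as an illustration, here is a minimal sketch of one such relaxation (a "soft" argmax that replaces the hard index with a softmax-weighted average of the index positions); the function name, temperature value, and tensor shapes below are only examples, not anything built into pytorch:

```python
import torch

def soft_argmax(logits, temperature=0.1):
    # Differentiable stand-in for argmax over the last dimension:
    # a softmax-weighted average of the index positions.
    weights = torch.softmax(logits / temperature, dim=-1)
    positions = torch.arange(logits.size(-1), dtype=logits.dtype, device=logits.device)
    return (weights * positions).sum(dim=-1)

logits = torch.randn(4, 10, requires_grad=True)
soft_idx = soft_argmax(logits)
soft_idx.sum().backward()
print(logits.grad is not None)   # True: gradients flow through the relaxation
```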
Thanks for your reply. I didn't understand your explanation; could you explain more, please?
Another way to say it is that argmax() is not usefully differentiable. Consider torch.argmax(torch.FloatTensor([x, 1.0])). argmax() will be 1 for x < 1.0 and 0 for x > 1.0. In both cases its derivative (gradient) with respect to x will be zero, and it won't be mathematically differentiable right at x = 1.0.

Having zero gradient almost everywhere isn't useful for gradient descent optimization, so pytorch doesn't bother to implement a backward for argmax().
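As a small illustration of the example above (the particular x values are just made up):

```python
import torch

# argmax of [x, 1.0] jumps from index 1 to index 0 as x crosses 1.0,
# and is constant (hence zero gradient) everywhere else.
for x in (0.5, 0.999, 1.001, 2.0):
    t = torch.tensor([x, 1.0])
    print(x, torch.argmax(t).item())   # 1 while x < 1.0, then 0 once x > 1.0

# The result of argmax() is an integer index tensor with no gradient history,
# so nothing can be backpropagated through it.
x = torch.tensor([0.5, 1.0], requires_grad=True)
idx = torch.argmax(x)
print(idx.requires_grad)               # False
```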
Thank you for your explanation, K. Frank.