Gradients with Argmax in PyTorch

Do you have a pointer to an implementation of the model in another framework? The derivative of argmax is zero almost everywhere (and undefined at ties), so it doesn’t seem likely that you can back-propagate through it in a way that is useful.
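
A quick way to see this in PyTorch (a minimal sketch, with a random tensor `x` standing in for whatever your model produces): `torch.argmax` returns integer indices that autograd does not track, so no gradient can flow back through it to the input.

```python
import torch

x = torch.randn(5, requires_grad=True)

idx = torch.argmax(x)
print(idx)                 # e.g. tensor(2) -- a LongTensor index
print(idx.requires_grad)   # False: argmax is piecewise constant, so autograd
                           # does not record a grad_fn for it

# Trying to back-propagate through the index fails, because the output
# is not connected to x in the autograd graph:
try:
    idx.float().sum().backward()
except RuntimeError as e:
    print(e)               # "element 0 of tensors does not require grad ..."
```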