Hello. I am building a GRU model to predict the location of a user in a simulation. I have quantized the 100x100 2D location into a one-hot matrix of 0s and 1s.

Ex:

Agent is in location x = 5 in a 1D map size of 10:

encoded label tensor = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

Based on this, I am trying to write a differentiable loss based on the distance between the most probable position in the encoded output and the position in the label.

Ex:

Agent is in location x = 5 in a 1D map size of 10:

encoded label tensor = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0] => argmax = 5th cell

output tensor = [0.5, 0.1, 0.2, 0.1, 0.2, 0.6, 0.3, 0.4, 0.4, 0.2] => argmax = 6th cell

Calculated distance = |5 - 6| = 1.
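
To make the example concrete, here is a small plain-Python sketch of the same calculation. The helper name argmax_cell is just something I made up for illustration, and cells are counted from 1 to match the numbering above.

```python
def argmax_cell(values):
    """Return the 1-based index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i]) + 1

label = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
output = [0.5, 0.1, 0.2, 0.1, 0.2, 0.6, 0.3, 0.4, 0.4, 0.2]

label_cell = argmax_cell(label)    # 5th cell
output_cell = argmax_cell(output)  # 6th cell

# Hard distance between the two argmax positions. This is the quantity
# I would like to penalise, but it is piecewise constant in `output`,
# so it gives no useful gradient.
distance = abs(label_cell - output_cell)
print(distance)
```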

This is a simple example. I actually work in a 2D space, so I have to calculate the Euclidean distance. I understand that the argmax operation isn't differentiable, but I am looking for a way to incorporate this distance value into the loss so that the penalties make sense: a prediction far from the true cell should cost more than a prediction close to it.
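
One direction that might avoid the argmax entirely is to penalise the expected distance under a softmax over the grid cells, rather than the distance of the single most probable cell. Below is a minimal sketch of that idea, not something I have validated. It assumes PyTorch, assumes the model outputs unnormalized scores (logits) of shape (batch, H, W), and assumes the label is available as (row, col) coordinates; the function name expected_distance_loss is hypothetical.

```python
import torch

def expected_distance_loss(logits, target_rc):
    """
    Hypothetical loss: mean expected Euclidean distance to the true cell.

    logits:    (batch, H, W) unnormalized scores over grid cells (assumption).
    target_rc: (batch, 2) true (row, col) positions as floats (assumption).
    """
    batch, height, width = logits.shape

    # Softmax over all H*W cells turns the scores into a distribution.
    probs = torch.softmax(logits.reshape(batch, -1), dim=1)  # (batch, H*W)

    # (row, col) coordinates of every cell, flattened in the same order.
    rows = torch.arange(height, dtype=logits.dtype, device=logits.device)
    cols = torch.arange(width, dtype=logits.dtype, device=logits.device)
    grid_r, grid_c = torch.meshgrid(rows, cols, indexing="ij")
    coords = torch.stack([grid_r.reshape(-1), grid_c.reshape(-1)], dim=1)  # (H*W, 2)

    # Distance from every cell to the true cell: (batch, H*W).
    # These are constants with respect to the logits.
    diffs = coords.unsqueeze(0) - target_rc.to(logits.dtype).unsqueeze(1)
    dists = torch.linalg.norm(diffs, dim=2)

    # Probability-weighted distance; gradients flow through `probs`.
    return (probs * dists).sum(dim=1).mean()


# Usage on random data with the 100x100 grid from the question.
logits = torch.randn(4, 100, 100, requires_grad=True)
target = torch.tensor([[5.0, 5.0], [10.0, 90.0], [50.0, 50.0], [99.0, 0.0]])
loss = expected_distance_loss(logits, target)
loss.backward()  # logits.grad is now populated
```

Probability mass placed far from the true cell is weighted by a larger distance, so it contributes more to the loss. Whether this trains well in practice, or needs to be combined with a cross-entropy term, is something I have not checked.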

I tried using the Gumbel function to create a one-hot-encoded output tensor, but I am not sure where to go from there, since I would still need an argmax to recover the position.

Sorry if the problem is confusing or if my explanation of it isn’t good enough.

Any help is appreciated.