Processing coordinates

OuisYasser · June 21, 2022, 11:06pm

Hello community,

Please, I am thinking about applying some idea but my input is considered to be X.Y coordinates and I am needed to extract a vector representing this input X Y

Mainly, I do have a trajectory on 2D space but I don’t have an idea about how to exploit it.

If anyone has an idea or have seen some papers on this problem please help me out

Thank you

Andrei_Cristea · June 21, 2022, 11:22pm

Hello, Yasser! Could you be a little more specific about what exactly you’re trying to calculate. In fact, it would be best if you could give a small example with actual data, so that we can understand what you need.

OuisYasser · June 21, 2022, 11:33pm

I do have a path that a pen followed in order to write some thing. Lets take the example of writing the letter A. So now my goal is to take the letter coordinates X and Y (let’s say a vector of size (2,200) ) and pass them into an ANN for example in order to have as an output one vector representing the probability of each word being written. (This is just an example to understand)
My problem is that I dont know how to do it (for example represent the coordinates in another space) or which architecture I should use.

Andrei_Cristea · June 21, 2022, 11:40pm

In the example given, at a very high level, a few ways I could imagine approaching this would be:

Use the (X, Y) path to literally draw the picture as a data processing step, then train a standard image classification model (look up ResNet as an example) that can label the resulting image. This is slow since you’re doing some not-strictly-necessary preprocessing work, but I think it will get the job done.
If doing this would make the picture very muddled (e.g. say you have touchpad information, so the (X, Y) from different timesteps overlap, and the order really matters) perhaps you can think of a (X, Y, T) input into a series of Conv3d layers. You may have to experiment to find an architecture that works here, perhaps if you search around you can find someone who has done this already.
Feed the coordinate vector directly into one or more RNN layers (e.g. LSTM) with two input features, and train it to recognize the output you’re looking for.

The right choice probably depends on the specifics of the problem.