Hi, I am new to PyTorch and I am trying to learn its basics. The first project I will need it for involves function approximation. My problem is the following: I have a robot that can radically change its morphology, and each morphology needs a simple PID controller with its own (P, I, D) gains.
Therefore, I would like to use a Neural Network (NN) that is able to approximate a function that goes from (size, inertia, … ) to (P, I, D).
For now, I will create the training dataset entirely in simulation, so I can generate quite a lot of data.
I do not yet have a clean dataset with the actual correspondence (size, inertia, … ) —> (P, I, D), but I am working on it.
I know that this problem might be a better fit for a reinforcement learning implementation, or even just a simple regression. However, to get my hands dirty with PyTorch, I thought it would be a nice problem to start with (and I will need it for my research).
So far I have followed the main PyTorch tutorials (here) and this one too. I would like to do something more oriented towards the problem described. Could you please suggest a path to follow?
Thanks in advance!
The use case sounds interesting!
What kind of format does your data have or are you a bit flexible?
The format (e.g. a fixed number of input features, temporal data, spatial data etc.) would determine the first path and model architecture you could try.
Hi, thanks a lot for your answer!
I am indeed flexible since I am working in simulation and able to generate data. The input features, for now, are mass, inertia, size, and some others I am thinking about related to the motors of the robot.
Regarding temporal/spatial data, if I understand what that means, I am not looking at the evolution of the robot in time, nor at anything depending on its position in space. The simulation is restarted again and again, and the robot always begins in the same initial state. I plan to change that in the future, but for what I need now I think it is enough. Is that what you meant, or should I tell you something else?
I think you clarified your approach.
For this static data, I would recommend starting with a vanilla fully-connected model (using linear layers).
I assume the features have different ranges, so I would recommend normalizing them such that each feature has zero mean and unit variance.
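A minimal sketch of that normalization, assuming the features are stacked row-wise into a tensor (the example values are made up):

```python
import torch

# Hypothetical dataset: each row holds (mass, inertia, size) for one robot
features = torch.tensor([
    [1.2, 0.030, 0.5],
    [2.5, 0.100, 0.8],
    [0.9, 0.010, 0.4],
    [1.8, 0.060, 0.7],
])

# Compute per-feature statistics (over the training set only, in practice)
mean = features.mean(dim=0)
std = features.std(dim=0)

# Each column now has zero mean and unit variance
normalized = (features - mean) / std
```

Keep the `mean` and `std` computed on the training split and reuse them to normalize validation and test data, so no statistics leak between splits.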
inertia seems to be a floating point feature, while size might be a discrete one.
If that's the case, you could think about using an nn.Embedding layer to transform this particular feature into a dense representation and concatenate it with the output activations for the other features.
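A rough sketch of that idea, under the assumption that size takes one of a few discrete values (the feature counts and dimensions here are made up for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical setup: 2 continuous features (mass, inertia) and one
# discrete "size" feature with, say, 5 possible categories
num_sizes, emb_dim = 5, 4
embedding = nn.Embedding(num_sizes, emb_dim)   # dense vector per size category
cont_layer = nn.Linear(2, 8)                   # processes the continuous features

batch_cont = torch.randn(16, 2)                      # mass, inertia
batch_size_idx = torch.randint(0, num_sizes, (16,))  # discrete size indices

# Transform each group separately, then concatenate the activations
x = torch.cat([cont_layer(batch_cont), embedding(batch_size_idx)], dim=1)
# x can now be fed into the rest of the fully-connected model
```

The concatenated tensor would then go through further linear layers ending in a 3-unit output for (P, I, D).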
Also, since you are working on a new architecture, I would recommend starting simple regarding both the model architecture and the dataset. E.g. you could use a model with only two linear layers and a small dataset of just 10 samples, and make sure your model overfits these samples. Once this is done, you can scale the problem up.
It can get quite tricky if you start with a very deep model and the complete dataset, as you have a lot of variables to play around with.
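The overfitting sanity check above could look roughly like this; the layer sizes, learning rate, and synthetic data are placeholder assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny synthetic dataset: 10 samples, 3 input features -> 3 targets (P, I, D)
X = torch.randn(10, 3)
y = torch.randn(10, 3)

# Deliberately small model: just two linear layers
model = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.MSELoss()

# Train until the model memorizes the 10 samples
for epoch in range(500):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

print(loss.item())  # should be close to zero if the setup is correct
```

If the loss does not drop towards zero here, there is likely a bug in the data pipeline, loss, or training loop, and it is much easier to find it at this scale than with the full dataset and a deep model.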
Thanks for the really detailed answer.
I want to add one thing: the feature size is also a floating point value. So, from what you said, it seems it will be even easier, since it is no different from the other features.
I will try to implement what you said and write here some advancements.