Making Predictions for Highly Overlapping Data

Harsh_Choudhary · June 13, 2022, 10:08am

I am working on a problem with 4 distinct target values. I need to train a neural net as a regressor to predict the correct value from my input features. The problem, however, is that the features I am using overlap heavily across the targets. So far, I have tried training a single neural net on the whole dataset, but that didn't give promising results. I also tried using K-means first to split the data into 4 clusters and then training a separate neural net on each cluster for regression. That didn't work the way I wanted either.

As a brief insight into my problem: suppose the 4 targets are 0, 1, 2, and 4. I need to train the neural net as a regressor because I want the model to generalize, as there may be many values in between in the future. I am attaching some images, which are scatter plots of my input features.

The main problem I face is that my neural net never performs well on all 4 classes. If the predictions for the extreme classes (0 and 4) are consistently good, the predictions for the middle classes are bad, and sometimes it is the other way around.
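To make the "good on some classes, bad on others" pattern concrete, it may help to break the regression error down by true target value instead of looking at one overall score. The snippet below is only a minimal sketch: the function name `per_target_mae` and the sample numbers are made up for illustration and do not come from my real data or model.

```python
from collections import defaultdict

def per_target_mae(y_true, y_pred):
    """Return the mean absolute error grouped by true target value."""
    errors = defaultdict(list)
    for target, prediction in zip(y_true, y_pred):
        errors[target].append(abs(target - prediction))
    # Sort by target so the report reads 0, 1, 2, 4
    return {t: sum(e) / len(e) for t, e in sorted(errors.items())}

# Hypothetical values, for illustration only (not real results)
y_true = [0, 0, 1, 1, 2, 2, 4, 4]
y_pred = [0.1, 0.3, 1.6, 0.2, 2.9, 1.1, 3.8, 4.0]

report = per_target_mae(y_true, y_pred)
for target, mae in report.items():
    print(f"target {target}: MAE = {mae:.2f}")
```

With made-up numbers like these, the extreme targets (0 and 4) show a much smaller error than the middle ones (1 and 2), which is the kind of imbalance described above.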

Also, when I tried clustering first to separate the data, 2 of the clusters were good, but the other 2 had a lot of misclassifications, so the neural nets trained on them couldn't make good predictions. In addition, each neural net now sees only about a fourth of the data that the single neural net had when trained on the whole dataset. I am also attaching an image of the clustering, where each color represents one item type (0, 1, 2, 4, respectively).
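One rough way to quantify how "good" each cluster is would be to compare the cluster assignments against the known item types and compute a per-cluster purity (the share of points belonging to the cluster's most common item type). This is a sketch under assumptions: `cluster_purity` is a hypothetical helper, and the cluster ids and item types below are invented for illustration, standing in for whatever labels the K-means step actually produced.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, item_types):
    """Map each cluster id to (majority item type, fraction of points with that type)."""
    counts = defaultdict(Counter)
    for cluster, item_type in zip(cluster_ids, item_types):
        counts[cluster][item_type] += 1
    purity = {}
    for cluster, type_counts in sorted(counts.items()):
        majority_type, majority_n = type_counts.most_common(1)[0]
        purity[cluster] = (majority_type, majority_n / sum(type_counts.values()))
    return purity

# Hypothetical assignments, for illustration only (not my real clustering)
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
item_types  = [0, 0, 0, 1, 1, 2, 2, 1, 2, 4, 4, 4]

purity = cluster_purity(cluster_ids, item_types)
for cluster, (item_type, fraction) in purity.items():
    print(f"cluster {cluster}: mostly type {item_type}, purity {fraction:.2f}")
```

In this invented example, two clusters are pure and two mix item types 1 and 2, which mirrors the situation where only some of the clusters separate cleanly.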