Poor model performance

Hello all,

I have been developing a model and a test dataset to segment and classify trees from drone imagery. The performance (overall accuracy, recall, etc.) of the new model has been exceptionally poor. I have been troubleshooting various things that I believe may be wrong, but I am scratching my head as to how my overall accuracy (OA) can be as low as 4-5%; the previous model (a similar architecture in TensorFlow) performed noticeably better, with some classes reaching an OA of 80-90%. While the dataset may not be perfectly accurate or particularly large, I suspect the cause of such poor performance is some step in my pipeline not behaving correctly.

Some background:

  • The dataset is high-resolution RGB drone imagery of a forest, with two additional layers: a texture index and a canopy height model (so five layers in total).
  • There are nine labeled classes [Elm, Oak, Willow, Locust, Ash, Hackberry, Hickory, elevation under 10 m, and canopy gaps], and the label values in the imagery start at 1 (Elm).
  • I am using a U-Net architecture in PyTorch and training it as a single multi-class problem over all nine classes (a minimal sketch of the data setup is shown after this list).
  • I have not included code snippets within this post as the scripts are fairly long, but I have shared them via a Google Drive link along with some examples of both the imagery and the labels.
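
To make the setup concrete, below is a minimal sketch (not my exact code; the class name, file format, and loading are placeholders) of how the five-channel chips and 1-based labels are meant to reach the nine-class loss. Note that PyTorch's nn.CrossEntropyLoss expects class indices starting at 0, so the sketch shifts the 1-based labels down by one.

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class TreeChipDataset(Dataset):
    """Sketch only: pairs a five-channel image chip (RGB + texture index + CHM)
    with its label raster. The .npy loading is a placeholder for however the
    chips are actually read from disk."""

    def __init__(self, image_paths, label_paths):
        assert len(image_paths) == len(label_paths), "image/label lists must line up"
        self.image_paths = image_paths
        self.label_paths = label_paths

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = np.load(self.image_paths[idx])   # (5, H, W) float array
        label = np.load(self.label_paths[idx])   # (H, W) array with values 1..9

        image = torch.from_numpy(image).float()
        # nn.CrossEntropyLoss expects class indices 0..8, so the 1-based
        # labels (1 = Elm, ..., 9 = canopy gaps) are shifted down by one here.
        label = torch.from_numpy(label).long() - 1
        return image, label
```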

When training, my validation loss and training loss track each other for only a few epochs (< 10) before they start diverging. The training loss plateaus very quickly (< 20 epochs), and even the per-class accuracy on the training data never seems to increase from epoch to epoch. I don't get any outright errors, but below are some of the things I have been looking at:

  • The indexing of the images and labels gets mixed up. This seemed to be the case with one of the ways I created the dataset; however, I have reverted to one I know is correct (made manually), and the outputs of the train and validation data loaders appear to be entering the network correctly (a sketch of the kind of check I mean is included after this list). I have tried several different ways of sorting the file lists before building the data loaders, yet I don't see any difference.
  • Image distortion was introduced by some processing step. Again, I have inspected the data at different points in the pipeline (correcting issues when found), yet it seems to be fine, and I am able to display both the data and the labels correctly with no visible distortion.
  • I have gone through and checked data types and dimensionality at multiple points in the model, and nothing appears to be getting corrupted.
  • I have tried different learning rates, loss functions, and schedulers, yet none of them seems to make an appreciable difference in the outcome (a rough example of the kind of configuration I have been varying is also included after this list).
  • The dataset is inherently too variable to train on. This could potentially be true; however, I have had some limited success with this same dataset in the past, and I would expect classification accuracy to at least beat randomly assigning each pixel a class (roughly 11% for nine classes)…
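
Here is a minimal sketch of the kind of data-loader sanity check I am referring to in the first three points (function and argument names are placeholders, not my exact code):

```python
import torch

def sanity_check_loader(loader, n_classes=9, n_batches=3):
    """Print shapes, dtypes, and value ranges for a few batches and make sure
    the labels stay inside the range the loss function expects."""
    for i, (images, labels) in enumerate(loader):
        if i >= n_batches:
            break
        print(f"batch {i}: images {tuple(images.shape)} {images.dtype}, "
              f"labels {tuple(labels.shape)} {labels.dtype}")
        print(f"  image min/max: {images.min().item():.3f} / {images.max().item():.3f}")
        print(f"  label values:  {torch.unique(labels).tolist()}")
        # With 0-based labels and nn.CrossEntropyLoss, anything outside
        # 0..n_classes-1 is a red flag.
        assert labels.min().item() >= 0 and labels.max().item() < n_classes, "labels out of range"
```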
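
And for the point about learning rates, loss functions, and schedulers, the configuration I have been varying looks roughly like this (the values, the uniform class weights, and the stand-in model are illustrative only; the real U-Net is in the Drive scripts):

```python
import torch
import torch.nn as nn

# Stand-in for the U-Net so the example runs: 5 input layers -> 9 class logits per pixel.
model = nn.Conv2d(5, 9, kernel_size=1)

# Placeholder class weights (all ones); rarer genera could be weighted more heavily.
class_weights = torch.ones(9)
criterion = nn.CrossEntropyLoss(weight=class_weights)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5
)

# Each epoch: train, compute the validation loss, then step the scheduler on it, e.g.
#   scheduler.step(val_loss)
```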

This is where I am at currently. In the past I have had problems with overfitting my training data, but here the model doesn't seem to be able to learn at all. I am currently training with a binary approach for each tree genus to see if that is any better, and I will post an update (a rough sketch of that setup is below). Regarding the scripts: you only need to run model_load.py to start the training. I am admittedly quite a novice at ML and programming and don't have anyone I can talk to about working through these issues, so I appreciate the help!
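
For reference, the binary-per-genus experiment I mention is set up roughly like the sketch below (a stand-in 1x1 convolution replaces the U-Net so the snippet runs on its own, and the random tensors are just dummy data):

```python
import torch
import torch.nn as nn

# Stand-in for the U-Net so this runs on its own: 5 input layers -> 1 logit per pixel.
model = nn.Conv2d(5, 1, kernel_size=1)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def to_binary_target(labels, genus_index):
    """Collapse the nine-class labels to 1 where the pixel is the chosen genus, else 0."""
    return (labels == genus_index).float()

# Dummy batch: two 64x64 chips with five channels and 0-based class labels.
images = torch.rand(2, 5, 64, 64)
labels = torch.randint(0, 9, (2, 64, 64))

targets = to_binary_target(labels, genus_index=0)   # e.g. Elm, if Elm is class 0
logits = model(images).squeeze(1)                   # (2, 64, 64)
loss = criterion(logits, targets)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```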