I was following this tutorial from PyTorch.
Unfortunately, as a new user I can only attach one image, so I can't include clean examples from the dataset in this post, only the model output.
I’m working on a problem: there are metal parts (in this case metal nuts) scattered in a box. Some of them are unobscured and lie roughly flat (within an angle threshold), while others lie on edge or are obscured by other parts. I need to detect only the first class of parts.
After fine-tuning resnet50, I'm getting results like this:
Model output
This particular dataset has 100 images. I did try a different one with 1000 images, but its quality wasn't very realistic, so I scrapped it.
I mainly have these questions:
- What loss function is used in this tutorial? From digging through the source code, it seems to be computed somewhere deep inside the model class, and I can't see a way to set the loss function I want.
For reference, whatever is being output as the loss looks like this over 100 epochs:
it starts at 4.65, drops to 4.50 over the first 15 epochs, and then oscillates there until the end.
- Why are the bounding boxes such a mess? The masks in the output are not too bad, but the bounding boxes are all over the place.
- Am I even using the right model? In the tutorial examples the masks didn't necessarily outline every person in the picture, only the fully visible ones, so it seemed like what I'm looking for, but the results are still pretty bad.
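In case it helps with the bounding box question, here is a minimal sketch of a sanity check that could be run on the annotations. It assumes the ground-truth boxes are derived from per-instance binary masks as [x_min, y_min, x_max, y_max], taking the min and max pixel indices of the mask; that is an assumption about the setup, not something confirmed from the tutorial code. The names `mask_to_box` and `is_degenerate` and the toy mask are made up for illustration.

```python
def mask_to_box(mask):
    """Return [x_min, y_min, x_max, y_max] for a binary mask, or None if empty.

    `mask` is a list of rows (H x W); truthy values mark object pixels.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return None  # empty mask: no box can be derived
    return [min(xs), min(ys), max(xs), max(ys)]


def is_degenerate(box):
    """True if the box has zero (or negative) width or height."""
    x_min, y_min, x_max, y_max = box
    return x_max <= x_min or y_max <= y_min


# Toy 4x5 mask (hypothetical data): object covers rows 1-2, columns 1-3.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = mask_to_box(toy_mask)
print(box, is_degenerate(box))
```

With this convention a mask that is only one pixel wide or tall produces a zero-size box, so empty or sliver masks in the annotations would show up as `None` or as degenerate boxes before they ever reach training.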
Any advice from your experience working on similar problems is also appreciated.