How do we detect an object that is only partially visible in a picture? I am working with Faster RCNN, YOLOv5, Swin-L Transformer, EfficientNet, etc. Is there any method to detect objects that are partially visible? The dataset contains frames taken from videos, so maybe tracking could help in some way, but I do not know how to do that.
As a general rule you want the images in your training set to be
as representative as possible of the images on which you want
to perform inference.
So you want to train on a set of images that includes a good
assortment of occluded (“partially visible”) objects.
Consider training a cow detector, and, for the sake of argument,
let’s assume that the cows in your images are always facing
sideways to the camera. Let’s say that all the cows in your training
images are full, unoccluded cows. It is plausible that your model
might “learn” that a cow has a head, a tail, and four legs, without
paying attention to the shape of the cow’s head, or its ears,
or its nose.
If you present your model with an inference image of a cow sticking
its head out from behind a barn, it’s very reasonable to expect that
your model won’t detect that cow. But if you had trained your model
on a mixture of cows that included both full cows and cows peeking
their heads out from behind barns, your model would likely have
learned that some cows have heads and tails and four legs, but that
other cows have heads with ears and noses of certain shapes, and
would likely be successful at detecting occluded cows.
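If your training set has few naturally occluded objects, one possible
way to get more is to hide part of each labeled object synthetically.
The following is only a minimal sketch of that idea, not something
built into any of the detectors mentioned. The names Box,
random_occluder, and visible_fraction are hypothetical helpers made up
for this illustration, and boxes are assumed to be non-empty rectangles
in pixel coordinates.

```python
import random
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned box in pixels: (x1, y1) inclusive, (x2, y2) exclusive."""
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def area(self) -> int:
        # Empty or inverted boxes have zero area.
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def random_occluder(box: Box, min_frac: float = 0.2, max_frac: float = 0.6,
                    rng: Optional[random.Random] = None) -> Box:
    """Return a rectangle that hides one side of `box`.

    The rectangle covers roughly min_frac..max_frac of the box, starting
    from a randomly chosen edge (like a barn hiding the back of a cow).
    """
    rng = rng or random.Random()
    frac = rng.uniform(min_frac, max_frac)
    width, height = box.x2 - box.x1, box.y2 - box.y1
    side = rng.choice(["left", "right", "top", "bottom"])
    if side == "left":
        return Box(box.x1, box.y1, box.x1 + round(width * frac), box.y2)
    if side == "right":
        return Box(box.x2 - round(width * frac), box.y1, box.x2, box.y2)
    if side == "top":
        return Box(box.x1, box.y1, box.x2, box.y1 + round(height * frac))
    return Box(box.x1, box.y2 - round(height * frac), box.x2, box.y2)


def visible_fraction(box: Box, occluder: Box) -> float:
    """Fraction of `box` left uncovered by `occluder` (box must be non-empty)."""
    overlap = Box(max(box.x1, occluder.x1), max(box.y1, occluder.y1),
                  min(box.x2, occluder.x2), min(box.y2, occluder.y2))
    return 1.0 - overlap.area / box.area


# Example: hide part of a labeled cow and check how much remains visible.
cow = Box(100, 50, 300, 200)
patch = random_occluder(cow, rng=random.Random(0))
print(patch, visible_fraction(cow, patch))
```

To apply the patch to an image stored as a NumPy array, you could fill
image[patch.y1:patch.y2, patch.x1:patch.x2] with a constant color or a
crop from another image, while keeping the original label for the
object. Real occlusions are still preferable where you can get them,
since a painted rectangle only approximates what a barn looks like.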
We have developed a simulation (that imitates a city) using game development tools to provide data, so if we take pictures as you described above, the model should work fine. I will report back on this. Thanks.