douglasrizzo
(Douglas De Rizzo Meneghetti)
June 24, 2021, 10:09am
1
The documentation for the SSD class says that the background should not be counted as an object class when passing the number of classes as a parameter to instantiate an SSD object.
            - scores (Tensor[N]): the scores for each detection
    Args:
        backbone (nn.Module): the network used to compute the features for the model.
            It should contain an out_channels attribute with the list of the output channels of
            each feature map. The backbone should return a single Tensor or an OrderedDict[Tensor].
        anchor_generator (DefaultBoxGenerator): module that generates the default boxes for a
            set of feature maps.
        size (Tuple[int, int]): the width and height to which images will be rescaled before feeding them
            to the backbone.
        num_classes (int): number of output classes of the model (excluding the background).
        image_mean (Tuple[float, float, float]): mean values used for input normalization.
            They are generally the mean values of the dataset on which the backbone has been trained on
        image_std (Tuple[float, float, float]): std values used for input normalization.
            They are generally the std values of the dataset on which the backbone has been trained on
        head (nn.Module, optional): Module run on top of the backbone features. Defaults to a module containing
            a classification and regression module.
        score_thresh (float): Score threshold used for postprocessing the detections.
        nms_thresh (float): NMS threshold used for postprocessing the detections.
        detections_per_img (int): Number of best detections to keep after NMS.
However, further down in the same file, an SSD object is instantiated in a function whose documentation explicitly says that the background should be counted as an object class, but the code does not account for this (i.e. I did not see num_classes decremented by one when creating the SSD object).
    anchor_generator = DefaultBoxGenerator([[2], [2, 3], [2, 3], [2, 3], [2], [2]],
                                           scales=[0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05],
                                           steps=[8, 16, 32, 64, 100, 300])
    defaults = {
        # Rescale the input in a way compatible to the backbone
        "image_mean": [0.48235, 0.45882, 0.40784],
        "image_std": [1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0],  # undo the 0-1 scaling of toTensor
    }
    kwargs = {**defaults, **kwargs}
    model = SSD(backbone, anchor_generator, (300, 300), num_classes, **kwargs)
    if pretrained:
        weights_name = 'ssd300_vgg16_coco'
        if model_urls.get(weights_name, None) is None:
            raise ValueError("No checkpoint is available for model {}".format(weights_name))
        state_dict = load_state_dict_from_url(model_urls[weights_name], progress=progress)
        model.load_state_dict(state_dict)
    return model
Here is the documentation for this function, which says we should include the background in the number of classes.
    Example:
        >>> model = torchvision.models.detection.ssd300_vgg16(pretrained=True)
        >>> model.eval()
        >>> x = [torch.rand(3, 300, 300), torch.rand(3, 500, 400)]
        >>> predictions = model(x)
    Args:
        pretrained (bool): If True, returns a model pre-trained on COCO train2017
        progress (bool): If True, displays a progress bar of the download to stderr
        num_classes (int): number of output classes of the model (including the background)
        pretrained_backbone (bool): If True, returns a model with backbone pre-trained on Imagenet
        trainable_backbone_layers (int): number of trainable (not frozen) resnet layers starting from final block.
            Valid values are between 0 and 5, with 5 meaning all backbone layers are trainable.
    """
    if "size" in kwargs:
        warnings.warn("The size of the model is already fixed; ignoring the argument.")
    trainable_backbone_layers = _validate_trainable_layers(
        pretrained or pretrained_backbone, trainable_backbone_layers, 5, 5)
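To make the mismatch concrete, here is a small plain-Python sketch (no torchvision involved) of what each docstring would imply for the value passed as num_classes. The helper name expected_num_classes and the three-category dataset are made up for illustration; this only restates the two readings and does not say which one the library actually follows.

```python
def expected_num_classes(num_object_categories: int, counts_background: bool) -> int:
    """Value to pass as num_classes under a given docstring reading (hypothetical helper)."""
    if num_object_categories < 1:
        raise ValueError("need at least one object category")
    # If the background counts as a class, one extra slot is needed for it.
    return num_object_categories + 1 if counts_background else num_object_categories


# Hypothetical dataset with three object categories.
categories = ["cat", "dog", "bird"]

# Reading of the SSD class docstring: "excluding the background".
ssd_class_reading = expected_num_classes(len(categories), counts_background=False)

# Reading of the ssd300_vgg16 docstring: "including the background".
factory_reading = expected_num_classes(len(categories), counts_background=True)

print("SSD class docstring implies num_classes =", ssd_class_reading)
print("ssd300_vgg16 docstring implies num_classes =", factory_reading)
```

Since the function forwards num_classes to SSD unchanged, the two readings cannot both describe the same model: they differ by exactly one output class.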
This is confusing. Should we or should we not count the background as an object class when instantiating the SSD? In either case, how should object classes be ID’d during training?
As an example, with Faster R-CNN, the background is counted as an object class (with ID 0 reserved for it), and actual object classes are identified during training starting from ID 1. What should be the procedure for SSD?
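For reference, the Faster R-CNN convention described above can be sketched in plain Python as follows. The category names, the build_label_map helper and the per-image annotation list are all hypothetical; the sketch only shows the ID layout (0 reserved for background, object classes from 1) and assumes nothing about whether SSD expects the same layout.

```python
BACKGROUND_ID = 0  # reserved for the background under the convention described above


def build_label_map(category_names):
    """Map each object category to an integer ID starting at 1 (hypothetical helper)."""
    return {name: idx for idx, name in enumerate(category_names, start=1)}


# Hypothetical dataset categories.
categories = ["cat", "dog", "bird"]
label_map = build_label_map(categories)

# Under this convention the model is told about the background as well,
# so the class count is the number of object categories plus one.
num_classes = len(label_map) + 1

# Hypothetical annotations for one training image, converted to integer labels.
image_annotations = ["dog", "dog", "cat"]
labels = [label_map[name] for name in image_annotations]

# Ground-truth boxes never carry the background ID.
assert BACKGROUND_ID not in labels

print(label_map, num_classes, labels)
```

With this layout, the open question for SSD is simply whether num_classes should be len(label_map) or len(label_map) + 1, and whether label 0 is likewise reserved.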
douglasrizzo
(Douglas De Rizzo Meneghetti)
June 28, 2021, 2:57pm
2