Background as a Class

In the TorchVision Object Detection Finetuning Tutorial, `num_classes` is set to 2: person and background. Is it a general rule to set the number of classes to the actual number of classes plus 1? Or does this apply only to image segmentation?
My project is about object detection, not image segmentation, and my dataset has four classes. Should I add background as a class or not?