Dice score for multi-class segmentation on medical images


I am working on a highly imbalanced medical image dataset (OCT), where the ground truth for each pixel is one of four values: 0 for background, and 1, 2, or 3 for the type of fluid. So it is a multi-class segmentation problem.

My question is:
To evaluate the model, I am using the Dice score. Should I include the background class when computing the mean Dice over all classes, or not?
If I do not include the background in the Dice calculation, how should I handle pixels whose ground truth is 0 (background only) but which the model predicted as one of the fluids (1, 2, or 3)?
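
To make the question concrete, here is a minimal sketch of the per-class Dice I have in mind, assuming the prediction and ground truth are NumPy integer label maps of the same shape. The function name `dice_per_class`, the `include_background` flag, and the choice to return NaN for a class absent from both maps are my own assumptions for illustration, not part of any library. In this sketch, a background pixel wrongly predicted as fluid still enters the denominator of that fluid class, so it appears to be penalized even when class 0 is left out of the mean.

```python
import numpy as np

def dice_per_class(pred, target, num_classes=4, include_background=False):
    """Return {class_id: Dice} for two integer label maps of equal shape."""
    classes = range(num_classes) if include_background else range(1, num_classes)
    scores = {}
    for c in classes:
        pred_c = (pred == c)      # pixels predicted as class c
        target_c = (target == c)  # pixels labelled as class c
        intersection = np.logical_and(pred_c, target_c).sum()
        denom = pred_c.sum() + target_c.sum()
        # Dice is undefined when the class is absent from both maps;
        # returning NaN here is an assumption, other conventions exist.
        scores[c] = float("nan") if denom == 0 else 2.0 * intersection / denom
    return scores

# Toy example: one background pixel is predicted as fluid 1,
# and one fluid 2 pixel is predicted as background.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 2, 2]])
pred = np.array([[0, 1, 1, 1],
                 [0, 0, 2, 0]])

scores = dice_per_class(pred, target, num_classes=4)
# Mean over the fluid classes that are actually present (NaN entries skipped).
mean_dice = np.nanmean(list(scores.values()))
print(scores, mean_dice)
```

Here the false positive on a background pixel lowers the Dice of class 1 from 1.0 to 0.8 although background itself is excluded. Passing `include_background=True` adds class 0 to the returned scores for comparison.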