When using CrossEntropyLoss for segmentation, how do I encode labels into a single-channel target image if values <= C are required?

The CrossEntropyLoss docs say the target value must be:
I have 4 types of objects that must be segmented, so in this case C == 4. If I want to use a single-channel target image where all values must be <= 4, with 0 for background, 1 for the first type of object…, how do I encode the fourth type of object? Or should the number of outputs be N+1 (treating background as a class), so that in my case there are 5 output channels (4 + background)?

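You would treat the background as a separate class, so your model outputs 5 channels (C = 5) and your target mask will have values in the range [0, 4]: 0 for background and 1 to 4 for the four object types. That satisfies the requirement that every target value is a valid class index, i.e. between 0 and C-1.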
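
A minimal sketch of this setup, in case it helps. The batch size, image size, and the random tensors standing in for the model output and the mask are assumptions for illustration only; the shapes and dtype are what `nn.CrossEntropyLoss` expects for class-index targets.

```python
import torch
import torch.nn as nn

# 4 object types + background (class 0) -> 5 classes
num_classes = 5
# Illustrative sizes, not taken from the original question
batch_size, height, width = 2, 8, 8

# Stand-in for the model output: raw logits with one channel per class,
# shape [N, C, H, W]
logits = torch.randn(batch_size, num_classes, height, width)

# Single-channel target holding one class index per pixel:
# shape [N, H, W] (no channel dimension), dtype long, values in [0, 4]
target = torch.randint(0, num_classes, (batch_size, height, width), dtype=torch.long)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)  # scalar tensor (default reduction is 'mean')
print(loss.item())
```

If your mask is stored as `[N, 1, H, W]`, remove the channel dimension with `target.squeeze(1)` before passing it to the loss.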