What is a soft label? Is soft label different from general CE in the Teacher model?

What is a soft label? Is soft label different from general CE in the Teacher model?

I suppose you are referring to Knowledge Distillation/Model compression.

The output of a model (called the teacher model in this context) is what is generally referred to as a soft label. It is called soft because the output need not be a one-hot vector such as [1, 0, 0] for a 3-class classification task; instead it might be something like [0.85, 0.1, 0.05].

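These soft labels are used to train a much smaller network (called the student model) instead of the hard targets. The loss can still be a cross-entropy; what changes is the target, which is the teacher's probability distribution rather than a one-hot vector.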
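To make the difference concrete, here is a minimal sketch in plain Python that evaluates the same cross-entropy formula against a hard target and against the soft label from the example above. The student logits and the helper names `softmax` and `cross_entropy` are assumptions made up for illustration; they do not come from any particular library or paper.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy H(target, predicted) = -sum_i target_i * log(predicted_i)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

# Hard target: one-hot vector for class 0 in a 3-class task.
hard_label = [1.0, 0.0, 0.0]
# Soft label: the teacher's output distribution from the example above.
soft_label = [0.85, 0.10, 0.05]

# Hypothetical student logits, chosen only for illustration.
student_logits = [2.0, 0.5, -1.0]
student_probs = softmax(student_logits)

# Same loss function in both cases; only the target differs.
hard_loss = cross_entropy(hard_label, student_probs)
soft_loss = cross_entropy(soft_label, student_probs)

print("student probabilities:", student_probs)
print("CE against hard label:", hard_loss)
print("CE against soft label:", soft_loss)
```

With a hard target only the log-probability of the correct class contributes to the loss, while with a soft label every class contributes in proportion to the probability the teacher assigned to it.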

Hope this makes sense. Thanks.

Thank you for your kind reply.
