If I understand your use case correctly, you are struggling to get good performance out of your model when training with soft labels, in particular when the target is set to 0.5? If so, have a look at this post, which shows that the binary cross-entropy cost stays non-zero in this case, even when the prediction matches the target exactly.
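
To make that concrete, here is a minimal sketch in plain Python (no framework assumed; the function name `binary_crossentropy` and the clipping constant `eps` are my own choices for illustration, not taken from the post). It evaluates the loss for a soft target of 0.5 at a few predictions and shows that the smallest value, reached at a prediction of 0.5, is ln 2 rather than zero.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for a single (possibly soft) label."""
    # Clip the prediction so that log(0) can never occur.
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(y_pred) + (1.0 - y_true) * math.log(1.0 - y_pred))


target = 0.5  # soft label
predictions = (0.1, 0.3, 0.5, 0.7, 0.9)
losses = [binary_crossentropy(target, p) for p in predictions]

for p, loss in zip(predictions, losses):
    print(f"prediction={p:.1f}  loss={loss:.4f}")

# Loss when the prediction matches the soft target exactly.
best_loss = binary_crossentropy(target, 0.5)
print(f"loss at a perfect prediction: {best_loss:.4f} (ln 2 = {math.log(2):.4f})")
```

So a loss that plateaus around 0.69 on such targets does not by itself mean the model is doing badly. More generally, for a soft target t the lowest reachable loss is the entropy of the label, -t ln t - (1 - t) ln(1 - t), so it may be more informative to look at the gap to that floor than at the raw loss value.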