Not exactly: with two classes, the loss for "random" guessing, where the model outputs ~0.5 for each sample, should be about ln(2) ≈ 0.693. In general, we expect a model at initialization to yield a loss of about ln(num_classes), which is why models trained on ImageNet (1,000 classes) typically start at a loss of about ln(1000) ≈ 6.9.
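
As a quick sanity check, the expected loss under uniform predictions can be computed directly: when the model assigns probability 1/C to every class, the cross-entropy is −log(1/C) = log(C) no matter which class is correct (a minimal sketch; `expected_initial_loss` is a hypothetical helper name):

```python
import math

def expected_initial_loss(num_classes: int) -> float:
    # Cross-entropy for a uniform prediction of 1/C per class:
    # -log(1/C) = log(C), regardless of the true label.
    return math.log(num_classes)

print(expected_initial_loss(2))     # ~0.693 (binary classification)
print(expected_initial_loss(1000))  # ~6.908 (ImageNet-scale, 1000 classes)
```

If the observed initial loss is far from log(C), it often points to a bad output-layer initialization or a labeling/preprocessing bug.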