Loss function for objectness, classes, and one bounding box

I’m trying to correctly train a classification/localization (a single bounding box) network.

Input is an image and the output is 11 values:

>  0: object_exists:  0 or 1
>  1: is_bird:        0 or 1    should be ignored if truth-object_exists==0
>  2: is_cat:         0 or 1    should be ignored if truth-object_exists==0
>  3: is_dog:         0 or 1    should be ignored if truth-object_exists==0
>  4: is_ladybug:     0 or 1    should be ignored if truth-object_exists==0
>  5: has_wings:      0 or 1    should be ignored if truth-object_exists==0
>  6: is_colorful:    0 or 1    should be ignored if truth-object_exists==0
>  7: bbox_center_x: -1 .. 1    should be ignored if truth-object_exists==0
>  8: bbox_center_y: -1 .. 1    should be ignored if truth-object_exists==0
>  9: bbox_width:     0 .. 2    should be ignored if truth-object_exists==0
> 10: bbox_height:    0 .. 2    should be ignored if truth-object_exists==0

When there is no object (i.e. truth-object_exists==0) all the classes and bbox values are set to 0.

I’m treating all values as regression targets with nn.MSELoss(), and avoiding nn.CrossEntropyLoss() since more than one class can be correct for the same image.

Here’s the training loop I use (the validation loop does the same):

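import torch.nn as nn

loss_function = nn.MSELoss()
# Model training loop
for data, truth in training_loader:
    data, truth = data.cuda(), truth.cuda()   # move the batch to the GPU
    optimizer.zero_grad()                     # clear gradients from the previous step
    predictions = model(data)                 # shape: (batch, 11)
    loss = loss_function(predictions, truth)  # MSE averaged over all 11 outputs
    loss.backward()
    optimizer.step()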

Training the model this way causes the predicted bounding box to skew heavily toward the top-left corner, since the object is never detected with 100% accuracy.

First question: how do I change
loss = loss_function(predictions, truth)
to ignore the classes and bounding box if truth-object_exists==0?
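
One possible way to do this is to multiply the per-element error by a mask built from the ground-truth object_exists column, so that rows without an object contribute nothing to the class and bounding box terms. The following is only a sketch: the function name masked_mse_loss is hypothetical, and it assumes predictions and truth are float tensors of shape (batch, 11) laid out as in the list above.

```python
import torch
import torch.nn.functional as F


def masked_mse_loss(predictions, truth):
    """MSE on object_exists for every sample; MSE on the other 10 values
    only for samples whose ground truth contains an object."""
    # Column 0 is object_exists and is always trained.
    exists_loss = F.mse_loss(predictions[:, 0], truth[:, 0])

    # Mask of shape (batch, 1): 1.0 where the truth has an object, else 0.0.
    mask = truth[:, 0:1]
    squared_error = (predictions[:, 1:] - truth[:, 1:]) ** 2

    # Average over the unmasked elements only; the clamp avoids dividing
    # by zero when a batch contains no objects at all.
    num_active = (mask.sum() * squared_error.shape[1]).clamp(min=1.0)
    rest_loss = (squared_error * mask).sum() / num_active

    return exists_loss + rest_loss
```

In the training loop, the call would become loss = masked_mse_loss(predictions, truth). Because the masked elements are multiplied by zero, their gradients are zero too, so the network is no longer pushed toward an all-zero box on empty images.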

Second question: how do I change the loss function so that the bounding box values are weighted differently from the classes?
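
A common pattern for this is to compute a separate loss per group of columns and combine them as a weighted sum. The sketch below assumes the 11-column layout listed above; the function name weighted_group_loss and the default weights are hypothetical placeholders that would need tuning on validation data, and masking of no-object samples is left out to keep the example short.

```python
import torch
import torch.nn.functional as F


def weighted_group_loss(predictions, truth,
                        exists_weight=1.0, class_weight=1.0, bbox_weight=5.0):
    """Weighted sum of three MSE terms, one per group of output columns.

    The default weights are illustrative assumptions, not recommended values.
    """
    exists_loss = F.mse_loss(predictions[:, 0], truth[:, 0])        # column 0
    class_loss = F.mse_loss(predictions[:, 1:7], truth[:, 1:7])     # columns 1..6
    bbox_loss = F.mse_loss(predictions[:, 7:11], truth[:, 7:11])    # columns 7..10
    return (exists_weight * exists_loss
            + class_weight * class_loss
            + bbox_weight * bbox_loss)
```

Logging the three terms separately during training can help when choosing the weights, since it shows which group dominates the total.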

Last question: is nn.MSELoss() the most appropriate choice for all of these output values?
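
One alternative worth comparing against plain MSE: since columns 0..6 are independent 0/1 labels, they can be treated as a multi-label problem with binary cross-entropy, which (unlike nn.CrossEntropyLoss()) allows several labels to be true at once, while the four bounding box values stay a regression problem. The sketch below uses smooth L1 for the box, which is one common choice and not the only one. It assumes the model emits raw logits (no sigmoid) for columns 0..6, which may not match the current network; the function name mixed_loss is hypothetical, and masking of no-object samples is omitted for brevity.

```python
import torch
import torch.nn.functional as F


def mixed_loss(predictions, truth):
    """Binary cross-entropy for the seven 0/1 outputs, smooth L1 for the bbox.

    Assumption: predictions[:, 0:7] are raw logits, not probabilities.
    """
    # Each of the 7 columns is scored as its own yes/no decision,
    # so any combination of labels can be correct for one image.
    binary_loss = F.binary_cross_entropy_with_logits(
        predictions[:, 0:7], truth[:, 0:7])

    # Smooth L1 is quadratic for small errors and linear for large ones,
    # which makes it less sensitive to outliers than MSE.
    bbox_loss = F.smooth_l1_loss(predictions[:, 7:11], truth[:, 7:11])

    return binary_loss + bbox_loss
```

With this setup, probabilities at inference time would come from torch.sigmoid(predictions[:, 0:7]).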

Thank you for any guidance.