Source | Paper

ModelA: `nn.CrossEntropyLoss()`

→ Is used for **Classification**.

ModelB: `nn.SmoothL1Loss(reduction='mean')`

→ Is used for **regression**.

I have two questions:

**1) How to combine the two models?**

**2) What should the `distance_regressor` look like together with the `loss_dist_fn`?**

Regarding 1) Combining the two models:

```
class_dist_parameters = list(classifier.parameters()) + list(distance_regressor.parameters())
optimizer = torch.optim.Adam(class_dist_parameters, lr=0.001)
```

Within my `training_step` I do everything sequentially, like this:

```
classifier.train()
distance_regressor.train()

# Within the for loop:
for batch_out, (image_dl, rois_dl, class_label_dl, distance_label_dl, df_train_dl) in enumerate(dataloader):
    distance_label_dl = distance_label_dl.reshape(distance_label_dl.shape[0], 1)

    # 1. Forward pass (rois_out is not defined in this snippet)
    output_class_pred = classifier(rois_out)
    roi_output_distance_pred = distance_regressor(rois_out)
    output_distance_pred = softplus(roi_output_distance_pred)

    # 2. Calculate and accumulate loss
    # size mismatch (got input: [9], target: [8]) -> fixed with reshape.
    loss_class = loss_fn_class(output_class_pred, class_label_dl)
    train_loss_class += loss_class.item()
    loss_distance = loss_fn_distance(output_distance_pred, distance_label_dl)
    train_loss_distance += loss_distance.item()

    # 3. Optimizer zero grad
    optimizer.zero_grad()

    # 4. Loss backward
    #loss.backward()
    loss_class.backward()
    loss_distance.backward()

    # 5. Optimizer step
    optimizer.step()

    # Calculate and accumulate accuracy metric across all batches
    y_pred_class_no = torch.argmax(torch.softmax(output_class_pred, dim=1), dim=1)

    # Not sure how to continue with the distance_regressor
    y_pred_distance_no = output_distance_pred
    #y_pred_distance_no = torch.argmax(torch.softplus(output_distance_pred), dim=1)

    # Classifier seems to work like this.
    train_acc_class += (y_pred_class_no == class_label_dl).sum().item()/len(output_class_pred)

    # Mean squared error of this batch, stored as a Python float
    train_rmse_acc_distance += ((torch.pow((y_pred_distance_no - distance_label_dl), 2)).sum()).item()/len(output_distance_pred)
```
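An alternative I am considering is to sum both losses and call `backward()` once. Below is a minimal sketch of that idea, not my real setup: the two `nn.Linear` stand-ins, the feature size of 16, the random batch, and the loss weight of `1.0` are all made-up placeholders, and `softplus` is left out to keep it short.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real classifier and distance_regressor.
classifier = nn.Linear(16, 3)
distance_regressor = nn.Linear(16, 1)

loss_fn_class = nn.CrossEntropyLoss()
loss_fn_distance = nn.SmoothL1Loss(reduction='mean')

# One optimizer over the parameters of both models, as in my code above.
params = list(classifier.parameters()) + list(distance_regressor.parameters())
optimizer = torch.optim.Adam(params, lr=0.001)

# Made-up batch: 8 feature vectors, 8 class labels, 8 distance targets.
features = torch.randn(8, 16)
class_labels = torch.randint(0, 3, (8,))
distance_labels = torch.rand(8, 1) * 10.0

optimizer.zero_grad()
loss_class = loss_fn_class(classifier(features), class_labels)
loss_distance = loss_fn_distance(distance_regressor(features), distance_labels)

# Sum the two losses; the weight 1.0 is an arbitrary placeholder.
loss_total = loss_class + 1.0 * loss_distance
loss_total.backward()  # a single backward pass fills the grads of both models
optimizer.step()
```

As far as I understand, this should give the same gradients as two separate `backward()` calls when the models share no layers, but it also works if they share a backbone, where the second separate call would need `retain_graph=True` on the first.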

Did I forget anything, or is there anything special to consider when combining the two models?

Regarding **2) What should the `distance_regressor` look like together with the `loss_dist_fn`?**

```
def build_distance_regressor(in_feature_size, first_layer_size, second_layer_size, out_feature_size):
    distance_regressor = nn.Sequential(
        nn.Flatten(1, -1),
        nn.Linear(in_features=in_feature_size, out_features=first_layer_size),
        nn.Linear(in_features=first_layer_size, out_features=second_layer_size),
        nn.Linear(in_features=second_layer_size, out_features=out_feature_size),
    )
    return distance_regressor.to(device)
```

with

```
dist_regressor = build_distance_regressor(
    in_feature_size=214016,
    first_layer_size=2048,
    second_layer_size=512,
    out_feature_size=1,
)
```

But right now I don't see any improvement in the `distance_regressor`.

The model output is a large negative value, which means that `softplus` maps it to (almost) zero, and even after 20 epochs nothing changes.
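To illustrate what I mean, here is a small pure-Python sketch of `softplus` and its derivative, assuming the standard definition softplus(x) = log(1 + exp(x)). The input values are made up and are not my actual model outputs. For a large negative input, both the output and the gradient are close to zero, which might explain why the regressor does not move.

```python
import math

def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def softplus_grad(x):
    """Derivative of softplus, which is the logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Made-up inputs: the more negative x is, the smaller output and gradient get.
for x in (-50.0, -10.0, 0.0, 10.0):
    print(f"x={x:6.1f}  softplus={softplus(x):.3e}  grad={softplus_grad(x):.3e}")
```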

The `classifier` (a simple linear layer) seems to work when used on its own.

So I am thinking about normalizing the distance targets with `min-max` scaling.
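What I have in mind for the normalization is roughly the following pure-Python sketch. The helper names (`min_max_fit`, `min_max_scale`, `min_max_unscale`) and the distance values are hypothetical, only there to show the idea: scale the targets to [0, 1] for training and map predictions back afterwards.

```python
def min_max_fit(values):
    """Return (min, max) of the training targets."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all targets are identical, cannot min-max scale")
    return lo, hi

def min_max_scale(values, lo, hi):
    """Map values from [lo, hi] to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def min_max_unscale(scaled, lo, hi):
    """Map values from [0, 1] back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]

# Made-up distance targets, not my real data.
distances = [2.5, 10.0, 40.0, 77.5]
lo, hi = min_max_fit(distances)
scaled = min_max_scale(distances, lo, hi)      # targets used for training
restored = min_max_unscale(scaled, lo, hi)     # back to original units
```

The `lo` and `hi` values would have to come from the training set only and be reused for validation and inference.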

Any ideas on how to get the `distance_regressor` working?

Maybe you have an idea? @ptrblck

P.S. I already consulted this topic: A model with multiple outputs