ModelA: `nn.CrossEntropyLoss()`
→ used for classification.
ModelB: `nn.SmoothL1Loss(reduction='mean')`
→ used for regression.
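For reference, the two losses make different shape assumptions; a quick check with toy tensors (not the real data):

```python
import torch
import torch.nn as nn

# nn.CrossEntropyLoss expects raw logits of shape [N, C] and
# integer class indices of shape [N] (no softmax beforehand).
loss_fn_class = nn.CrossEntropyLoss()
logits = torch.randn(8, 3)                 # toy batch: 8 samples, 3 classes
class_targets = torch.randint(0, 3, (8,))
loss_class = loss_fn_class(logits, class_targets)        # scalar

# nn.SmoothL1Loss expects input and target of the SAME shape,
# e.g. [N, 1] for one distance per sample -- which is why the
# distance targets get reshaped to [N, 1] in the training loop.
loss_fn_distance = nn.SmoothL1Loss(reduction='mean')
pred = torch.randn(8, 1)
distance_targets = torch.rand(8, 1)
loss_distance = loss_fn_distance(pred, distance_targets)  # scalar
```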
I have two questions:

1) How do I combine the two models?
2) What should the `distance_regressor` look like, together with the `loss_fn_distance`?

Referring to 1) Combining the two models:
```python
class_dist_parameters = list(classifier.parameters()) + list(distance_regressor.parameters())
optimizer = torch.optim.Adam(class_dist_parameters, lr=0.001)
```
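Combining both parameter sets in a single optimizer like this works; a minimal self-contained sketch (the two `nn.Linear` modules are just stand-ins for the real models):

```python
import itertools
import torch
import torch.nn as nn

# Tiny stand-in modules -- the real classifier / distance_regressor
# would be used in their place.
classifier = nn.Linear(16, 3)
distance_regressor = nn.Linear(16, 1)

# One optimizer over both parameter sets:
class_dist_parameters = list(classifier.parameters()) + list(distance_regressor.parameters())
optimizer = torch.optim.Adam(class_dist_parameters, lr=0.001)

# Equivalent, without materializing the intermediate lists:
optimizer = torch.optim.Adam(
    itertools.chain(classifier.parameters(), distance_regressor.parameters()),
    lr=0.001,
)
```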
Within my training_step I do everything serially, like this:
```python
classifier.train()
distance_regressor.train()

for batch_out, (image_dl, rois_dl, class_label_dl, distance_label_dl, df_train_dl) in enumerate(dataloader):
    # SmoothL1Loss needs matching shapes: [N] -> [N, 1]
    # (fixes "size mismatch (got input: [9], target: [8])")
    distance_label_dl = distance_label_dl.reshape(distance_label_dl.shape[0], 1)

    # 1. Forward pass through both heads
    output_class_pred = classifier(rois_dl)
    roi_output_distance_pred = distance_regressor(rois_dl)
    output_distance_pred = F.softplus(roi_output_distance_pred)  # F = torch.nn.functional

    # 2. Calculate and accumulate loss
    loss_class = loss_fn_class(output_class_pred, class_label_dl)
    train_loss_class += loss_class.item()
    loss_distance = loss_fn_distance(output_distance_pred, distance_label_dl)
    train_loss_distance += loss_distance.item()

    # 3. Optimizer zero grad
    optimizer.zero_grad()

    # 4. Loss backward (the two losses live on separate graphs, so two
    #    backward calls work; (loss_class + loss_distance).backward()
    #    would be equivalent here)
    loss_class.backward()
    loss_distance.backward()

    # 5. Optimizer step
    optimizer.step()

    # Calculate and accumulate accuracy metric across all batches
    y_pred_class_no = torch.argmax(torch.softmax(output_class_pred, dim=1), dim=1)

    #######################################
    # Not sure how to continue with the distance_regressor
    #######################################
    # Regression: use the raw prediction directly -- argmax over dim=1
    # of a [N, 1] output would always return 0.
    y_pred_distance_no = output_distance_pred

    ### Classifier seems to work like this.
    train_acc_class += (y_pred_class_no == class_label_dl).sum().item() / len(output_class_pred)
    ###
    # Note: this accumulates the batch-mean *squared* error; take the
    # square root at the end of the epoch for an actual RMSE.
    train_rmse_acc_distance += torch.pow(y_pred_distance_no - distance_label_dl, 2).sum() / len(output_distance_pred)
```
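The usual pattern for a multi-head setup is to sum the two losses and call `backward()` once, which also covers the case where both heads later share a backbone. A hedged, self-contained sketch of one training step (toy modules and toy batch, not the real data):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the real models.
classifier = nn.Linear(16, 3)
distance_regressor = nn.Linear(16, 1)
optimizer = torch.optim.Adam(
    list(classifier.parameters()) + list(distance_regressor.parameters()),
    lr=0.001,
)

rois = torch.randn(8, 16)                 # toy batch
class_label = torch.randint(0, 3, (8,))
distance_label = torch.rand(8, 1)

output_class_pred = classifier(rois)
output_distance_pred = F.softplus(distance_regressor(rois))

loss_class = F.cross_entropy(output_class_pred, class_label)
loss_distance = F.smooth_l1_loss(output_distance_pred, distance_label)

optimizer.zero_grad()
# Optionally weight the terms, e.g. loss_class + 0.5 * loss_distance,
# if one loss dominates the other.
loss = loss_class + loss_distance
loss.backward()
optimizer.step()
```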
Anything I forgot, or anything special about combining the two models?
Referring to 2) What should the `distance_regressor` look like, together with the `loss_fn_distance`?
```python
def build_distance_regressor(in_feature_size, first_layer_size, second_layer_size, out_feature_size):
    distance_regressor = nn.Sequential(
        nn.Flatten(1, -1),
        nn.Linear(in_features=in_feature_size, out_features=first_layer_size),
        nn.Linear(in_features=first_layer_size, out_features=second_layer_size),
        nn.Linear(in_features=second_layer_size, out_features=out_feature_size),
    )
    return distance_regressor.to(device)
```
with

```python
dist_regressor = build_distance_regressor(in_feature_size=214016, first_layer_size=2048, second_layer_size=512, out_feature_size=1)
```
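One thing that stands out in the definition above: there is no activation between the `nn.Linear` layers, so the three-layer stack collapses mathematically to a single linear map. A sketch with `nn.ReLU()` added between the layers (`.to(device)` omitted for brevity, and tiny sizes used in the example call just to keep it light):

```python
import torch
import torch.nn as nn

def build_distance_regressor(in_feature_size, first_layer_size,
                             second_layer_size, out_feature_size):
    # Without activations, stacked Linear layers are equivalent to one
    # Linear layer; ReLU (or similar) restores the non-linearity.
    return nn.Sequential(
        nn.Flatten(1, -1),
        nn.Linear(in_feature_size, first_layer_size),
        nn.ReLU(),
        nn.Linear(first_layer_size, second_layer_size),
        nn.ReLU(),
        nn.Linear(second_layer_size, out_feature_size),
    )

# Tiny sizes for illustration; the real call would use
# (214016, 2048, 512, 1) as above.
reg = build_distance_regressor(32, 16, 8, 1)
out = reg(torch.randn(4, 32))   # shape [4, 1]
```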
But right now I don't get any improvements from the `distance_regressor`. The model output is a large negative value, which means `softplus` squashes it to (almost) zero, and even after 20 epochs nothing changes. The `classifier` seems to work on its own (a simple linear layer).

So I am thinking about normalizing the distance targets with min–max scaling.

Any ideas how to get the `distance_regressor` working?
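Min–max scaling of the targets is straightforward to try; a hypothetical sketch (fit the statistics on the training split only, and invert the scaling when reporting actual distances):

```python
import torch

def fit_minmax(targets):
    # Compute scaling statistics on the TRAINING targets only.
    return targets.min(), targets.max()

def scale(targets, d_min, d_max):
    # Map targets into [0, 1].
    return (targets - d_min) / (d_max - d_min)

def unscale(preds, d_min, d_max):
    # Invert the scaling to recover real distances from predictions.
    return preds * (d_max - d_min) + d_min

# Made-up training targets, shape [N, 1] as in the training loop.
train_targets = torch.tensor([[3.0], [10.0], [25.0], [7.5]])
d_min, d_max = fit_minmax(train_targets)
scaled = scale(train_targets, d_min, d_max)     # now in [0, 1]
restored = unscale(scaled, d_min, d_max)
```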
Maybe you have an idea? @ptrblck
P.S. I already consulted this topic: A model with multiple outputs