I am wondering where exactly to place functions like nn.Softmax in a specific model, and why.
In model descriptions I have seen that people usually apply this kind of function at the end of a model (after the last Linear layer).
For my problem this doesn't seem to work.
I take the output of my vgg_feature_extractor, extract some RoIs (regions of interest), and send these to my two models.
Model A: classifier (one single Linear layer, nothing else).
criterion_classification = nn.CrossEntropyLoss()
Model B: distance_regressor (three Linear layers, nothing else).
criterion_distance = nn.SmoothL1Loss(reduction='mean')
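To make the setup concrete, here is a minimal sketch of the two heads and their criteria. The feature size, the hidden sizes, the batch size, and the dummy tensors (including the name distance_label) are placeholders I made up for illustration. Only the layer counts, the 9 classes, and the two loss functions come from my actual code.

```python
import torch
from torch import nn

# Placeholder sizes (assumptions): 512 features per RoI, 9 classes.
NUM_FEATURES = 512
NUM_CLASSES = 9

# Model A: one single Linear layer that outputs raw class scores (logits).
classifier = nn.Linear(NUM_FEATURES, NUM_CLASSES)

# Model B: three Linear layers and nothing else (hidden sizes are assumptions).
distance_regressor = nn.Sequential(
    nn.Linear(NUM_FEATURES, 128),
    nn.Linear(128, 32),
    nn.Linear(32, 1),
)

criterion_classification = nn.CrossEntropyLoss()
criterion_distance = nn.SmoothL1Loss(reduction='mean')

# Dummy batch: 4 flattened RoI feature vectors with dummy targets.
rois_out = torch.randn(4, NUM_FEATURES)
class_label_dl = torch.randint(0, NUM_CLASSES, (4,))
distance_label = torch.rand(4, 1) * 10

# Both criteria reduce the batch to a single scalar loss.
loss_class = criterion_classification(classifier(rois_out), class_label_dl)
loss_dist = criterion_distance(distance_regressor(rois_out), distance_label)
```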
If I add the Softmax to the end of model A and train it, the train_acc gets stuck at 0.57x (it does not move anymore).
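A possible explanation, as far as I understand the documentation: nn.CrossEntropyLoss already applies log-softmax internally, so a Softmax at the end of the model means the softmax is effectively applied twice. The sketch below uses random dummy logits (not my real data) to compare both variants. The lower bound at the end is my own derivation for inputs restricted to [0, 1].

```python
import math

import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 9) * 5         # dummy raw scores: 4 samples, 9 classes
labels = torch.randint(0, 9, (4,))     # dummy class labels

criterion = nn.CrossEntropyLoss()

# CrossEntropyLoss on raw logits is the same as NLL on log_softmax(logits).
loss_logits = criterion(logits, labels)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# Feeding probabilities instead: softmax is now applied a second time inside the loss.
probs = torch.softmax(logits, dim=1)
loss_probs = criterion(probs, labels)

# Probabilities lie in [0, 1], so even a perfect one-hot prediction cannot
# push this loss below log(1 + (C - 1) / e) for C classes.
num_classes = logits.shape[1]
lower_bound = math.log(1.0 + (num_classes - 1) / math.e)

print(loss_logits.item(), loss_manual.item(), loss_probs.item(), lower_bound)
```

If this is right, the loss with the extra Softmax can never get close to zero and the gradients are squashed, which would fit the accuracy getting stuck.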
The right code seems to be (classifier only):

    # 1. Forward pass (raw outputs of the Linear layer, no Softmax)
    output_class_pred = classifier(rois_out)

    # 2. Calculate and accumulate loss
    loss = loss_fn(output_class_pred, class_label_dl)
    train_loss += loss.item()

    # 3. Optimizer zero grad
    optimizer.zero_grad()

    # 4. Loss backward
    loss.backward()

    # 5. Optimizer step
    optimizer.step()

    y_pred_class_no = torch.argmax((torch.softmax(output_class_pred, dim=1)), dim=1)
    train_acc += (y_pred_class_no == class_label_dl).sum().item()/len(output_class_pred)

    # Adjust metrics to get average loss and accuracy per batch
    train_loss = train_loss / len(dataloader)
    train_acc = train_acc / len(dataloader)
(The Linear layer is the output of model A in this scenario.)
If I do it as shown above, it seems to work. After a couple of epochs I get 87–89 % train_acc (with 9 classes).
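On the accuracy line: as far as I can tell, the torch.softmax inside torch.argmax only matters if I need the probabilities themselves. Softmax keeps the order of the scores within each row, so the predicted class should be the same with or without it. A small check with random dummy logits (the tensor shape is an assumption):

```python
import torch

torch.manual_seed(0)
output_class_pred = torch.randn(8, 9)  # dummy logits: 8 samples, 9 classes

# Predicted class via probabilities (as in my training loop) ...
pred_from_probs = torch.argmax(torch.softmax(output_class_pred, dim=1), dim=1)

# ... and directly from the raw logits.
pred_from_logits = torch.argmax(output_class_pred, dim=1)

same_prediction = torch.equal(pred_from_probs, pred_from_logits)
print(same_prediction)
```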
So I am wondering why some people add the Softmax to the end of their model, or where I should place it correctly …
For the distance_regressor I am still not sure where to place the Softplus (which I want to use to get only positive distance values).
If it is placed at the end of the model (distance_regressor), before the loss_dist_fn, the predicted_distance seems to remain 0 (since the models_output is negative).
Yet this placement seems more logical to me, since I only want positive values.
If I place it after the loss_dist_fn, the values seem to adjust more quickly to positive predictions.
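For reference, this is a minimal sketch of the first variant, with Softplus as the last layer so that the loss sees only positive predictions. The layer sizes and the dummy data are assumptions, not my real setup. The second part looks at the slope of softplus, which is sigmoid(z). That slope is almost 0 for strongly negative inputs, and I suspect this is why the prediction seems to stay at 0 when the layer before it outputs large negative values.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)

# Placeholder sizes (assumptions): 16 input features, one distance output.
distance_regressor = nn.Sequential(
    nn.Linear(16, 8),
    nn.Linear(8, 4),
    nn.Linear(4, 1),
    nn.Softplus(),  # maps any real value to a positive one
)
loss_dist_fn = nn.SmoothL1Loss(reduction='mean')

x = torch.randn(4, 16)            # dummy RoI features
target = torch.rand(4, 1) * 10    # dummy positive distances

predicted_distance = distance_regressor(x)
loss = loss_dist_fn(predicted_distance, target)
loss.backward()                   # gradients flow back through Softplus

# Slope of softplus at a very negative, a zero and a very positive input.
z = torch.tensor([-20.0, 0.0, 20.0], requires_grad=True)
F.softplus(z).sum().backward()
slopes = z.grad                   # roughly [0, 0.5, 1]
print(predicted_distance.squeeze(1), slopes)
```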
Any hint on where these two functions should usually be placed? (At the end of the model, after the loss_fn, …)