I have a network which performs regression and classification simultaneously.
import torch.nn as nn

class Network(nn.Module):
    def __init__(self, ...):
        ...
        self.gru = nn.GRU(input_size, hidden_size, ...)
        self.classifier = nn.Linear(hidden_size, 2)  # 2-class logits
        self.regressor = nn.Linear(hidden_size, 1)   # scalar regression output

    def forward(self, input):
        output, hidden = self.gru(input)
        hidden = hidden.sum(dim=0)  # sum the final hidden states over layers/directions
        op_reg = self.regressor(hidden)
        op_class = self.classifier(hidden)
        return op_reg, op_class
I define two loss functions: MSELoss for regression and CrossEntropyLoss for classification. To backpropagate, I perform the following steps:
regression_loss = nn.MSELoss()
classification_loss = nn.CrossEntropyLoss()
optimizer = Adam ......

optimizer.zero_grad()  # clear gradients left over from the previous step
op_reg, op_class = model(input)
loss_reg = regression_loss(op_reg.squeeze(-1), Y_regression)
loss_class = classification_loss(op_class, Y_class)
loss = loss_reg + loss_class  # unweighted sum of the two losses
loss.backward()
optimizer.step()
The scale of the gradients from the cross-entropy loss is going to differ from the scale of the gradients from the MSE loss. Will this affect the training of the model, and is this the correct way to train it? Thanks.
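To make the concern about gradient scales concrete, here is a minimal sketch that compares how much gradient each loss sends into the shared GRU parameters. The layer sizes, the random inputs and targets, and the helper name `shared_grad_norm` are all assumptions made for illustration; they are not taken from my actual model or data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class Network(nn.Module):
    # Sizes below are illustrative assumptions, not the real configuration.
    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, 2)
        self.regressor = nn.Linear(hidden_size, 1)

    def forward(self, x):
        _, hidden = self.gru(x)        # hidden: (num_layers, batch, hidden_size)
        hidden = hidden.sum(dim=0)     # (batch, hidden_size)
        return self.regressor(hidden), self.classifier(hidden)

def shared_grad_norm(loss, params):
    """Hypothetical helper: L2 norm of d(loss)/d(params), leaving .grad untouched."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

model = Network()
x = torch.randn(4, 5, 8)               # (batch, seq_len, input_size), random data
y_reg = torch.randn(4) * 10            # assumed regression targets with a large scale
y_class = torch.randint(0, 2, (4,))    # random binary class labels

op_reg, op_class = model(x)
loss_reg = nn.MSELoss()(op_reg.squeeze(-1), y_reg)
loss_class = nn.CrossEntropyLoss()(op_class, y_class)

# Only the GRU parameters are shared between the two heads.
shared = list(model.gru.parameters())
norm_reg = shared_grad_norm(loss_reg, shared)
norm_class = shared_grad_norm(loss_class, shared)
print(f"regression grad norm on GRU:     {norm_reg:.4f}")
print(f"classification grad norm on GRU: {norm_class:.4f}")
```

If the two norms differ by a large factor, the larger one would dominate the update to the shared GRU when the losses are simply added, which is the situation I am asking about.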