Multiple output multiple task model

sougata_saha · June 19, 2020, 7:55pm

I have a network which performs regression and classification simultaneously.

class Network(nn.Module):
  def __init__(self, ...):
     ....
     self.gru = nn.GRU(input_size, hidden_size, ...)
     self.classifier = nn.Linear(hidden_size, 2)   # two-class logits
     self.regressor = nn.Linear(hidden_size, 1)    # scalar regression output

  def forward(self, input):
    output, hidden = self.gru(input)
    hidden = hidden.sum(dim=0)
    op_reg = self.regressor(hidden)
    op_class = self.classifier(hidden)

    return op_reg, op_class

I define two loss functions: MSELoss for regression and CrossEntropyLoss for classification. My training step looks like this:

regression_loss = nn.MSELoss()
classification_loss = nn.CrossEntropyLoss()
optimizer = Adam ......
op_reg, op_class = model(input)
loss_reg = regression_loss(op_reg.squeeze(-1), Y_regression)
loss_class = classification_loss(op_class, Y_class)

loss = loss_reg + loss_class
optimizer.zero_grad()  # clear gradients left over from the previous step
loss.backward()
optimizer.step()

Since the scale of the gradients from the cross-entropy loss will differ from the scale of the gradients from the MSE loss, will this affect the training of the model? And is this the correct way to train it? Thanks.

Scott_Hoang · June 19, 2020, 10:38pm

This looks correct. Your “model” is simply two models that share a hidden layer (the GRU). Your loss backpropagation should be fine.
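
As a rough check of the point above, the sketch below compares the gradient that the summed loss sends into the shared GRU with the gradients from each loss taken separately. The layer sizes, batch shape, and random targets are made up for illustration and are not taken from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
gru = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
classifier = nn.Linear(8, 2)
regressor = nn.Linear(8, 1)

x = torch.randn(5, 3, 4)             # (batch, seq_len, features)
y_reg = torch.randn(5)               # made-up regression targets
y_class = torch.randint(0, 2, (5,))  # made-up class labels

_, hidden = gru(x)
hidden = hidden.sum(dim=0)           # (batch, hidden_size), as in the question
loss_reg = nn.MSELoss()(regressor(hidden).squeeze(-1), y_reg)
loss_class = nn.CrossEntropyLoss()(classifier(hidden), y_class)

# Gradients w.r.t. the shared GRU parameters, one loss at a time and summed.
params = list(gru.parameters())
g_reg = torch.autograd.grad(loss_reg, params, retain_graph=True)
g_class = torch.autograd.grad(loss_class, params, retain_graph=True)
g_sum = torch.autograd.grad(loss_reg + loss_class, params)

# The summed loss yields the sum of the two per-task gradients.
grads_add_up = all(
    torch.allclose(s, r + c, atol=1e-5)
    for s, r, c in zip(g_sum, g_reg, g_class)
)

def total_norm(grads):
    # L2 norm over all GRU parameter gradients taken together.
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

norm_reg, norm_class = total_norm(g_reg), total_norm(g_class)
print(f"gradients add up: {grads_add_up}")
print(f"GRU grad norm from MSE loss: {norm_reg:.4e}")
print(f"GRU grad norm from CE loss:  {norm_class:.4e}")
```

If one of the two norms is consistently much larger than the other, that task dominates the update to the shared layer. Weighting the losses, for example `loss = a * loss_reg + b * loss_class` with hand-picked `a` and `b`, is one common way to rebalance them.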

ptrblck · June 20, 2020, 8:38am

You could use .register_hook on the model.gru parameters and check the gradient magnitude.
Both losses will accumulate the gradients in this layer, so you might need to reduce the learning rate for these parameters.
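
A minimal sketch of both suggestions, assuming a small stand-in for the `Network` class from the question: the constructor arguments, tensor shapes, targets, and learning rates below are placeholders, not values from the thread. It registers a hook on each GRU parameter to record the gradient norm, and gives the GRU its own optimizer parameter group with a lower learning rate.

```python
import torch
import torch.nn as nn

class Network(nn.Module):
    # Stand-in for the model in the question; sizes are hypothetical defaults.
    def __init__(self, input_size=4, hidden_size=8):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, 2)
        self.regressor = nn.Linear(hidden_size, 1)

    def forward(self, x):
        _, hidden = self.gru(x)
        hidden = hidden.sum(dim=0)
        return self.regressor(hidden), self.classifier(hidden)

torch.manual_seed(0)
model = Network()

grad_norms = {}

def make_hook(name):
    # The hook receives the gradient computed for this parameter in backward().
    def hook(grad):
        grad_norms[name] = grad.norm().item()
    return hook

handles = [
    p.register_hook(make_hook(name))
    for name, p in model.gru.named_parameters()
]

# Separate parameter groups: a smaller (placeholder) learning rate for the
# shared GRU, the optimizer-wide default for the two heads.
head_params = list(model.classifier.parameters()) + list(model.regressor.parameters())
optimizer = torch.optim.Adam(
    [
        {"params": model.gru.parameters(), "lr": 1e-4},
        {"params": head_params},
    ],
    lr=1e-3,
)

x = torch.randn(5, 3, 4)             # (batch, seq_len, features), made up
y_reg = torch.randn(5)
y_class = torch.randint(0, 2, (5,))

optimizer.zero_grad()
op_reg, op_class = model(x)
loss = nn.MSELoss()(op_reg.squeeze(-1), y_reg) + nn.CrossEntropyLoss()(op_class, y_class)
loss.backward()
optimizer.step()

for name, norm in grad_norms.items():
    print(f"{name}: grad norm {norm:.4e}")

# Remove the hooks once the inspection is done.
for h in handles:
    h.remove()
```

Whether the GRU actually needs a lower learning rate depends on the recorded magnitudes, so it is worth logging them over several iterations before changing anything.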