How to handle gradients when doing multi-task training?

JoeHEZHAO · November 2, 2017, 11:54pm

Hello everyone,

I am new to PyTorch and have a question about gradient calculation in multi-task training.

For example, I need to train classification on samples with label == 1 or 0 and, at the same time, train regression only on samples with label == 1. The target format for label == 1 is [batch, class 1, x1, x2, y1, y2] and for label == 0 is [batch, class 0, 0].

My output tensor for classification is out_cls [batch, class] (like [100, 2]) and for regression is out_reg [batch, pred_x1, pred_x2, pred_y1, pred_y2] (like [100, 4]).

I expect the final loss to be loss = loss_cls + loss_reg, followed by loss.backward().

What I am doing now
Get the valid indices for classification and regression. Select the valid rows with out_cls = out_cls[valid_index] and out_reg = out_reg[valid_index]. Then use these new tensors to calculate the losses, like loss_cls = nn.CrossEntropyLoss()(out_cls, target_cls) and loss_reg = nn.MSELoss()(out_reg, target_reg).
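The steps above can be sketched roughly as follows. This is only a minimal illustration, assuming a recent PyTorch (0.4 or later) and assuming every sample is valid for classification while only label == 1 samples are valid for regression. The random tensors and the name reg_mask are placeholders invented for the example, not part of any real model.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch_size = 6
# Hypothetical network outputs; in practice these come from the model.
out_cls = torch.randn(batch_size, 2, requires_grad=True)  # [batch, class]
out_reg = torch.randn(batch_size, 4, requires_grad=True)  # [batch, x1, x2, y1, y2]

# Hypothetical targets: a class label for every sample, and box coordinates
# that are only meaningful where the label is 1.
target_cls = torch.tensor([1, 0, 1, 0, 0, 1])
target_reg = torch.rand(batch_size, 4)

# Rows that take part in the regression loss (label == 1).
reg_mask = target_cls == 1

loss_cls = nn.CrossEntropyLoss()(out_cls, target_cls)
loss_reg = nn.MSELoss()(out_reg[reg_mask], target_reg[reg_mask])

loss = loss_cls + loss_reg
loss.backward()

# Rows with label == 0 never entered loss_reg, so their gradient rows are zero.
print(out_reg.grad)
```

Printing out_reg.grad is a quick way to confirm that only the label == 1 rows receive a regression gradient.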

Is this good practice for multi-task training? Am I successfully avoiding gradient computation for regression on samples with label == 0?

Thanks a lot.

hughperkins · November 3, 2017, 6:04am

Seems like you might need to use .gather rather than the [] operator? You can check with code like:

import torch
from torch import nn
from torch.autograd import Variable


batch_size = 4
out_reg1 = Variable(torch.rand(batch_size, 3), requires_grad=True)
# One column index per row: gather picks a single scalar from each row.
valid_index = torch.LongTensor([1, 0, 2, 0])
print('out_reg1', out_reg1)
out_reg2 = out_reg1.gather(1, Variable(valid_index.view(-1, 1)))
print('out_reg2', out_reg2)
mse_loss = nn.MSELoss()
loss_reg = mse_loss(out_reg2, Variable(torch.rand(batch_size, 1)))
loss_reg.backward()
# Only the gathered positions should have a non-zero gradient.
print('out_reg1.grad', out_reg1.grad)

Result:

out_reg1 Variable containing:
 0.7507  0.2493  0.4223
 0.6270  0.9968  0.5263
 0.9014  0.5669  0.3850
 0.3649  0.1115  0.2633
[torch.FloatTensor of size 4x3]

out_reg2 Variable containing:
 0.2493
 0.6270
 0.3850
 0.3649
[torch.FloatTensor of size 4x1]

out_reg1.grad Variable containing:
 0.0000 -0.3239  0.0000
-0.0951  0.0000  0.0000
 0.0000  0.0000  0.1379
 0.0450  0.0000  0.0000
[torch.FloatTensor of size 4x3]
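On PyTorch 0.4 and later, where Variable has been merged into Tensor, the same check can be written without the wrapper. The sketch below also compares the gradient against the value expected from the MSE formula. The names target and expected are introduced only for this illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch_size = 4
out_reg1 = torch.rand(batch_size, 3, requires_grad=True)
valid_index = torch.tensor([1, 0, 2, 0])
target = torch.rand(batch_size, 1)

# Pick one column per row, as in the example above.
out_reg2 = out_reg1.gather(1, valid_index.view(-1, 1))
loss_reg = nn.MSELoss()(out_reg2, target)
loss_reg.backward()

# MSELoss averages over the 4 gathered values, so each gathered entry gets
# 2 * (prediction - target) / 4 and every other entry gets 0.
expected = torch.zeros(batch_size, 3)
expected.scatter_(
    1,
    valid_index.view(-1, 1),
    (2 * (out_reg2 - target) / batch_size).detach(),
)
grads_match = torch.allclose(out_reg1.grad, expected)
print(grads_match)
```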

JoeHEZHAO · November 3, 2017, 3:28pm

Thanks for replying. Yes, I am not sure that indexing with [] would do the job.

What should I do to make out_reg2 look like this?
[0.7507 0.2493 0.4223,
0.3649 0.1115 0.2633
]

I need to select whole row vectors, not just individual scalar values. Is that possible?

Best
Joe

JoeHEZHAO · November 3, 2017, 3:56pm

Sorry to bother you. I think the method I need is torch.index_select, like below:

out_reg1 Variable containing:
0.7507 0.2493 0.4223
0.6270 0.9968 0.5263
0.9014 0.5669 0.3850
0.3649 0.1115 0.2633
[torch.FloatTensor of size 4x3]

valid_index = torch.LongTensor([0, 2])
out_reg2 = torch.index_select(out_reg1, 0, Variable(valid_index))
out_reg2 = [
0.7507 0.2493 0.4223,
0.9014 0.5669 0.3850
]

May I ask whether you think this is different from []? Will it help me calculate the gradient correctly?

Best
Joe
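One way to check this question empirically is to run both selection methods on identical inputs and compare the resulting gradients. The following is a small sketch assuming PyTorch 0.4 or later. The tensors out_a and out_b and the random target are made up for the comparison.

```python
import torch

torch.manual_seed(0)

# Two independent leaf tensors holding the same values.
out_a = torch.rand(4, 3, requires_grad=True)
out_b = out_a.detach().clone().requires_grad_(True)

valid_index = torch.tensor([0, 2])
target = torch.rand(2, 3)

sel_a = torch.index_select(out_a, 0, valid_index)  # rows 0 and 2
sel_b = out_b[valid_index]                          # same rows via []

# Same mean-squared-error loss on both selections.
((sel_a - target) ** 2).mean().backward()
((sel_b - target) ** 2).mean().backward()

same_values = torch.equal(sel_a, sel_b)
same_grads = torch.allclose(out_a.grad, out_b.grad)
print(same_values, same_grads)
print(out_a.grad)  # rows 1 and 3 were not selected
```

If both flags come out true, the two ways of selecting rows behave the same for backpropagation in this setup, and the unselected rows receive no gradient.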