Hello everyone,
I am new to PyTorch and have a question about gradient computation in multi-task training.
For example, I need to train classification on data with label == 1 or 0. At the same time, I need to train regression only on data with label == 1. The data format for label == 1 is [batch, class 1, x1, x2, y1, y2] and for label == 0 it is [batch, class 0, 0].
My output tensor for classification is out_cls [batch, class] (like [100, 2]) and for regression is out_reg [batch, pred_x1, pred_x2, pred_y1, pred_y2] (like [100, 4]).
I expect the final loss to be loss = loss_cls + loss_reg, followed by loss.backward().
What I am doing now:
Get the valid indices for classification and regression. Get the valid tensors with out_cls = out_cls[valid_index] and out_reg = out_reg[valid_index]. Then use these new tensors to calculate the losses, like loss_cls = nn.CrossEntropyLoss()(out_cls, target_cls) and loss_reg = nn.MSELoss()(out_reg, target_reg).
Is this good practice for multi-task training? Am I successfully avoiding computing gradients for regression on data with label == 0?
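To make the setup concrete, here is a minimal sketch of the masking approach described above, written against a recent PyTorch release. The tensor values, the batch size, and the name reg_mask are hypothetical and only serve the illustration; the real outputs would come from the model.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch_size = 6
# Hypothetical network outputs; in practice these come from the model.
out_cls = torch.randn(batch_size, 2, requires_grad=True)  # class logits
out_reg = torch.randn(batch_size, 4, requires_grad=True)  # x1, x2, y1, y2

# Hypothetical targets: label 1 samples carry a box, label 0 samples do not.
target_cls = torch.tensor([1, 0, 1, 0, 0, 1])
target_reg = torch.rand(batch_size, 4)  # rows for label 0 are never used

# Boolean mask of the samples that take part in the regression loss.
reg_mask = target_cls == 1

# Classification uses every sample; regression uses only the masked rows.
loss_cls = nn.CrossEntropyLoss()(out_cls, target_cls)
loss_reg = nn.MSELoss()(out_reg[reg_mask], target_reg[reg_mask])

loss = loss_cls + loss_reg
loss.backward()

# Rows of out_reg with label 0 end up with an all-zero gradient.
print(out_reg.grad)
```

In this sketch the rows excluded by the mask get a zero gradient, which is the behaviour asked about. One thing to watch for: if a batch happens to contain no label == 1 samples, the masked MSE is taken over an empty tensor and evaluates to NaN, so such batches probably need separate handling.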
Thanks a lot.
Seems like you might need to use .gather rather than the [] operator? You can check with code like:
import torch
from torch import nn
from torch.autograd import Variable

batch_size = 4
# Leaf variable whose gradient we want to inspect
out_reg1 = Variable(torch.rand(batch_size, 3), requires_grad=True)
# One column index per row
valid_index = torch.LongTensor([1, 0, 2, 0])
print('out_reg1', out_reg1)
# Pick out_reg1[i, valid_index[i]] for each row i; the result has shape (4, 1)
out_reg2 = out_reg1.gather(1, Variable(valid_index.view(-1, 1)))
print('out_reg2', out_reg2)
mse_loss = nn.MSELoss()
loss_reg = mse_loss(out_reg2, Variable(torch.rand(batch_size, 1)))
loss_reg.backward()
# Only the gathered positions receive a non-zero gradient
print('out_reg1.grad', out_reg1.grad)
Result:
out_reg1 Variable containing:
0.7507 0.2493 0.4223
0.6270 0.9968 0.5263
0.9014 0.5669 0.3850
0.3649 0.1115 0.2633
[torch.FloatTensor of size 4x3]
out_reg2 Variable containing:
0.2493
0.6270
0.3850
0.3649
[torch.FloatTensor of size 4x1]
out_reg1.grad Variable containing:
0.0000 -0.3239 0.0000
-0.0951 0.0000 0.0000
0.0000 0.0000 0.1379
0.0450 0.0000 0.0000
[torch.FloatTensor of size 4x3]
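For readers on a recent PyTorch release, the same check can be sketched without the Variable wrapper, since plain tensors carry requires_grad directly. This version deviates from the code above in two labelled ways: it uses fixed input values instead of torch.rand, and it replaces the MSE loss with a plain sum so that the gradient at each gathered position is exactly 1.

```python
import torch

# Fixed values (an assumption for readability; the original used torch.rand).
out_reg1 = torch.tensor([[0.1, 0.2, 0.3],
                         [0.4, 0.5, 0.6],
                         [0.7, 0.8, 0.9],
                         [1.0, 1.1, 1.2]], requires_grad=True)
valid_index = torch.tensor([1, 0, 2, 0])

# gather picks one column per row: out_reg2[i, 0] = out_reg1[i, valid_index[i]]
out_reg2 = out_reg1.gather(1, valid_index.view(-1, 1))

# A plain sum stands in for the loss, so each gathered entry gets gradient 1.
out_reg2.sum().backward()

# Non-gathered positions stay at zero, matching the pattern in the result above.
print(out_reg1.grad)
```

Note that gather selects a single element per row here, which is why the result has shape 4x1 and not whole rows.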
Thanks for replying. Yes, I am not sure whether indexing with [] would do the job.
What should I do to make out_reg2 look like this?
[0.7507 0.2493 0.4223,
0.3649 0.1115 0.2633]
I need to select whole row vectors, not just individual scalar values. Is that possible?
Best
Joe
Sorry to bother you. I think the method I need is torch.index_select, like below:
out_reg1 Variable containing:
0.7507 0.2493 0.4223
0.6270 0.9968 0.5263
0.9014 0.5669 0.3850
0.3649 0.1115 0.2633
[torch.FloatTensor of size 4x3]
valid_index = torch.LongTensor([0, 2])
out_reg2 = torch.index_select(out_reg1, 0, Variable(valid_index))
out_reg2 = [
0.7507 0.2493 0.4223,
0.9014 0.5669 0.3850
]
May I ask, do you think this is different from []? Will this help me calculate the gradients correctly?
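One way to probe this question is to run both selection methods on identical inputs and compare the results. The sketch below assumes a recent PyTorch release (no Variable wrapper) and uses a hypothetical sum-of-squares loss purely so that backward() has something to differentiate; the names a, b, rows_select, and rows_bracket are made up for the comparison.

```python
import torch

values = torch.tensor([[0.7507, 0.2493, 0.4223],
                       [0.6270, 0.9968, 0.5263],
                       [0.9014, 0.5669, 0.3850],
                       [0.3649, 0.1115, 0.2633]])
valid_index = torch.tensor([0, 2])

# Two independent leaf tensors with identical values, one per method.
a = values.clone().requires_grad_(True)
b = values.clone().requires_grad_(True)

rows_select = torch.index_select(a, 0, valid_index)  # rows 0 and 2
rows_bracket = b[valid_index]                        # same rows via []

# Compare the forward results.
print(torch.equal(rows_select, rows_bracket))

# Apply the same hypothetical loss to both paths.
(rows_select ** 2).sum().backward()
(rows_bracket ** 2).sum().backward()

# Compare the gradients; rows 1 and 3 were not selected.
print(torch.equal(a.grad, b.grad))
print(a.grad)
```

In this sketch both methods select the same rows and leave the unselected rows with zero gradient, which suggests that indexing a dimension with an index tensor and torch.index_select are interchangeable for this purpose.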
Best
Joe