My problem is that I have a mask/tensor of shape [B] and want to multiply it with a tensor of shape [B, 3] like this:
[1, 0] * [[1, 2, 3], [4, 5, 6]] = [[1 * 1, 1 * 2, 1 * 3], [0 * 4, 0 * 5, 0 * 6]] = [[1, 2, 3], [0, 0, 0]]
How do I multiply such a mask of shape [Batchsize] with my positions of shape [Batchsize, 3]?
In my implementation, the multiplication seems to be interpreted as a dot product and raises "RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 1" (with Batchsize = 2).
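For reference, the error message is what PyTorch reports when two shapes cannot be broadcast: shapes are aligned from the last dimension, so a [B] mask is compared against the 3 in [B, 3]. A minimal sketch of the row-wise product from the example above, assuming the mask is a float tensor of shape [B], gives the mask a trailing singleton dimension first (the variable names here are only illustrative):

```python
import torch

# Mask of shape [B] and positions of shape [B, 3], with B = 2 as in the example.
mask = torch.tensor([1.0, 0.0])
positions = torch.tensor([[1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])

# unsqueeze(1) turns the mask into shape [B, 1], which broadcasts against
# [B, 3]: every row of positions is scaled by its own mask entry.
masked = mask.unsqueeze(1) * positions

print(masked.shape)  # torch.Size([2, 3])
print(masked)        # first row unchanged, second row all zeros
```

Writing `mask[:, None] * positions` should behave the same way; both only add a dimension of size 1 and copy no data.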
I'm trying to implement a custom loss function that combines a classification loss with a (conditional) regression loss, similar to the post "Loss function conditional on two outputs".
In my case, the model has 5 outputs. Two should refer to the class and the other three refer to a position.
I want to add the regression loss only if the target has a specific class.
I tried to adopt the approach from that post and use (target_class == 1).float() as a mask, which results in the error mentioned above.
class LocateLoss(nn.Module):
    def __init__(self):
        super(LocateLoss, self).__init__()
        self.cross_entropy_loss = nn.CrossEntropyLoss()
        self.b_cross_entropy_loss = nn.BCEWithLogitsLoss()
        self.soft_max = nn.Softmax(dim=1)

    def forward(self, input, target):
        input_class, input_loc = torch.split(input, [2, 3], dim=1)
        input_class = self.soft_max(input_class)
        target_class, target_loc = torch.split(target, [1, 3], dim=1)
        target_class = target_class.reshape(1, -1).squeeze()
        input_loc = input_loc * (target_class == 1).float()
        l1 = self.cross_entropy_loss(input_class, target_class)
        l2 = self.b_cross_entropy_loss(input_loc, target_loc)
        return l1 + l2
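A hedged sketch of how a loss like this could apply the mask, not a definitive fix. The class name `MaskedLocateLoss` is hypothetical, and the sketch makes several assumptions: the target stores the class index in column 0 and the position in columns 1 to 3, the position targets lie in [0, 1] (required by `BCEWithLogitsLoss`), and the goal is for samples of other classes to contribute nothing to the regression term. It masks the per-element loss rather than the input, passes raw logits to `CrossEntropyLoss` (which applies log-softmax internally), and casts the class target to `long`.

```python
import torch
import torch.nn as nn


class MaskedLocateLoss(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.class_loss = nn.CrossEntropyLoss()
        # reduction="none" keeps one loss value per element so it can be masked.
        self.loc_loss = nn.BCEWithLogitsLoss(reduction="none")

    def forward(self, input, target):
        input_class, input_loc = torch.split(input, [2, 3], dim=1)
        target_class, target_loc = torch.split(target, [1, 3], dim=1)
        target_class = target_class.squeeze(1).long()    # shape [B]
        l1 = self.class_loss(input_class, target_class)  # expects raw logits
        mask = (target_class == 1).float().unsqueeze(1)  # shape [B, 1]
        per_elem = self.loc_loss(input_loc, target_loc) * mask  # shape [B, 3]
        # Average over the unmasked elements only; clamp avoids dividing by zero.
        l2 = per_elem.sum() / (mask.sum() * per_elem.size(1)).clamp(min=1.0)
        return l1 + l2


torch.manual_seed(0)
criterion = MaskedLocateLoss()
input = torch.randn(2, 5)
target = torch.tensor([[1.0, 0.2, 0.5, 0.9],
                       [0.0, 0.1, 0.1, 0.1]])
loss = criterion(input, target)

# Changing the position target of the class-0 sample should not change the loss.
target_changed = target.clone()
target_changed[1, 1:] = 0.8
loss_changed = criterion(input, target_changed)
```

Masking the loss instead of `input_loc` matters because a zeroed input is still compared against its target and would add a nonzero term for the samples that were meant to be ignored.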