Writing WARP Loss layer

Hey,

I’ve been trying to implement the Weighted Approximate-Rank Pairwise loss (WARP loss) from https://arxiv.org/pdf/1312.4894.pdf and wanted to check with folks here whether my implementation is correct, since I can’t seem to find a solid resource on writing custom layers in PyTorch.

Here’s the code:

import numpy as np
import torch
import torch.nn as nn


class WARPLoss(nn.Module):
    def __init__(self, num_labels=204):
        super(WARPLoss, self).__init__()
        # rank_weights[k] = 1 + 1/2 + ... + 1/(k+1), the L(k) weighting from the paper
        self.rank_weights = [1.0 / 1]
        for i in range(1, num_labels):
            self.rank_weights.append(self.rank_weights[i - 1] + 1.0 / (i + 1))

    def forward(self, input, target):
        """
        :param input: Deep features tensor Variable of size batch x n_attrs.
        :param target: Ground truth tensor Variable of size batch x n_attrs.
        :return: WARP loss summed over all positive labels in the batch.
        """
        batch_size = target.size()[0]
        n_labels = target.size()[1]
        max_num_trials = n_labels - 1
        loss = 0.0

        for i in range(batch_size):

            for j in range(n_labels):
                if target[i, j] == 1:

                    # indices of all negative labels for this sample
                    neg_labels_idx = np.array([idx for idx, v in enumerate(target[i, :]) if v == 0])
                    neg_idx = np.random.choice(neg_labels_idx, replace=False)
                    sample_score_margin = 1 - input[i, j] + input[i, neg_idx]
                    num_trials = 0

                    # keep sampling negatives until one violates the margin
                    while sample_score_margin < 0 and num_trials < max_num_trials:
                        neg_idx = np.random.choice(neg_labels_idx, replace=False)
                        num_trials += 1
                        sample_score_margin = 1 - input[i, j] + input[i, neg_idx]

                    # estimated rank: floor((Y - 1) / N), where N is the total number of
                    # samples drawn (the initial draw plus the resamples); the +1 also
                    # avoids a division by zero when the first negative already violates
                    r_j = int(np.floor(max_num_trials / (num_trials + 1)))
                    weight = self.rank_weights[r_j]

                    for k in range(n_labels):
                        if target[i, k] == 0:
                            score_margin = 1 - input[i, j] + input[i, k]
                            loss += (weight * torch.clamp(score_margin, min=0.0))

        return loss

Would autograd work correctly on this code, or am I doing something wrong? I can even try writing the backward pass myself if that makes more sense.
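In case it helps, here’s roughly how I’m exercising the layer. This is just a dummy smoke test with made-up shapes and data, not my actual training code:

# rough smoke test with dummy data (not my real training loop)
from torch.autograd import Variable

criterion = WARPLoss(num_labels=10)

scores = Variable(torch.randn(2, 10), requires_grad=True)
targets = torch.zeros(2, 10)
targets[0, 0] = 1   # first sample: label 0 is the positive
targets[1, 3] = 1   # second sample: label 3 is the positive
labels = Variable(targets)

loss = criterion(scores, labels)
loss.backward()
print(loss.data[0])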

UPDATE
Edited the code above so that PyTorch computes the right value without complaining.

Still hoping for some help here. :slight_smile:

Everything looks good; the only line that’s weird is:

if target[i, j] == 1:

If target is a Variable, this should error out in the next release. The way to do this is: if target.data[i, j] == 1

Awesome. Thanks for the help @smth. I’m still on 0.1.12 but will keep this in mind for the upcoming release. Good luck with that! :smiley:

@varunagrawal I’m interested in this too - did you get it working? If so, did you make any changes to the code above?

Got it working. I updated the code above to match what I am using. Training is super slow, though, so I may end up writing the backward pass as well.


Hi @varunagrawal, thanks for sharing your code. I tried using it, but training is really slow for me too, as you said. Did you manage to speed it up? Looking forward to your reply.

Hey @varunagrawal — I’ve got an approximation to the WARP loss implemented in my package. The loss definition itself is here; you can see it in use here.

In implementing it, I’ve made some concessions to the minibatch nature of PyTorch operations. Instead of sampling negatives and taking the first one that violates the ranking, I sample a fixed number of negatives at every step and, for each observation in the minibatch, take the maximum of the resulting loss values. I have omitted the approximate rank weighting, but I’m sure you could add it by, say, taking the count of rank-violating negatives for every observation; the key to the rank term is that it approximates how hard it is for the model to get the ranking wrong for any given observation.
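To make the idea concrete, a rough sketch of that sampled-max variant in plain PyTorch could look something like the following (the function name, shapes, and the way scores are gathered are made up for illustration, not lifted from my package):

import torch


def sampled_warp_approximation(positive_scores, negative_scores):
    # positive_scores: (batch,) scores of the observed/positive items
    # negative_scores: (batch, n_neg) scores of n_neg randomly sampled negatives
    # hinge margin of every sampled negative against the positive
    hinge = torch.clamp(1.0 - positive_scores.unsqueeze(1) + negative_scores, min=0.0)
    # take the hardest (maximum-loss) sampled negative per observation,
    # instead of the first rank-violating one as in exact WARP
    hardest, _ = hinge.max(dim=1)
    # a crude rank weight could be derived from the number of violating
    # negatives, e.g. (hinge > 0).float().sum(1), but it is omitted here
    return hardest.mean()

Here positive_scores and negative_scores would come from however your model scores each observation against its positive item and against the sampled negatives.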
