How to return a boolean tensor depending on condition?

Aldebaran · April 27, 2022, 3:15pm

Consider the Cartesian product of a set of query_idx and a set of doc_idx, as shown below:

torch.cartesian_prod(preds["query_idx"], preds["doc_idx"])
# tensor([
#     [0, 9], [0, 5], [0, 7], [1, 9], [1, 5], [1, 7], [2, 9], [2, 5], [2, 7]
# ])

Also consider the relevance map, stating which documents are relevant in relation to a given query:

relevance_map = {
    0: [5,9], # the docs 5 and 9 are relevant results concerning the query 0.
    1: [9],
    2: [5,7]
}

How can I then generate a new tensor that classifies each pair of the Cartesian product as True (relevant) or False (non-relevant) according to the relevance map?

torch.tensor([
    True, True, False, True, False, False, False, True, True
])

Andrei_Cristea · April 27, 2022, 3:26pm

Hi Aldebaran, you could try this:

torch.tensor([x[1].item() in relevance_map.get(x[0].item(), []) for x in cartesian_prod], dtype=torch.bool)
Output:
tensor([ True,  True, False,  True, False, False, False,  True,  True])

Aldebaran · April 27, 2022, 7:27pm

That works as a solution. However, it is not differentiable, because it uses the .item() operation.

Andrei_Cristea · April 27, 2022, 7:48pm

I’m not sure that this type of result is differentiable, since it’s so discontinuous. I think you’d have to define a continuous function that somehow captures how close a pair in your cartesian_prod is to being relevant (say, with 1 for totally irrelevant and 0 if in the relevance map, and somehow smooth in between) and run .sum().backward() on that. Basically, I think you need to extend this discrete math problem to a continuous case in some way, and then differentiate that.

Personally I’m not sure how to do that.

Hopefully I’m wrong and there’s a far simpler and more direct way to do this, and someone more knowledgeable here can help you out.

EDIT: Thinking about this more, one way to convert this to a differentiable problem would be to make your output a matrix shaped like (number_of_docs x number_of_queries). Its entries would be floats in the (0, 1) range. Your target would be a similarly shaped matrix whose values are either 0 or 1 depending on whether that pair of (document, query) is in the relevance map or not. You can run some type of binary cross entropy loss between these two matrices. Of course, you could run into memory problems if you have lots of docs and queries but your relevance map is very sparse, etc.
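
A minimal sketch of that idea, under a few assumptions of mine: the matrix is laid out as (number_of_queries x number_of_docs) so that flattening it matches the order of cartesian_prod, the `logits` tensor is a random stand-in for whatever scores a real model would produce, and the loss is `binary_cross_entropy_with_logits`, which applies the sigmoid internally so the raw scores do not have to be in (0, 1) beforehand.

```python
import torch
import torch.nn.functional as F

query_idx = torch.tensor([0, 1, 2])
doc_idx = torch.tensor([9, 5, 7])
relevance_map = {0: [5, 9], 1: [9], 2: [5, 7]}

# Target of shape (num_queries, num_docs): 1.0 where the pair is relevant.
# Built once with plain Python; it is a constant and needs no gradient.
target = torch.tensor([
    [float(d in relevance_map.get(q, [])) for d in doc_idx.tolist()]
    for q in query_idx.tolist()
])

# Hypothetical model output: one raw score (logit) per (query, doc) pair.
torch.manual_seed(0)
logits = torch.randn(len(query_idx), len(doc_idx), requires_grad=True)

# The loss is differentiable with respect to the scores, not the boolean mask.
loss = F.binary_cross_entropy_with_logits(logits, target)
loss.backward()

print(target.flatten().bool())  # the boolean mask asked for in the question
print(logits.grad.shape)
```

The boolean mask only appears as a fixed target here; gradients flow into the float scores.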

Matias_Vasquez · April 28, 2022, 11:18am

Hi,

I am assuming that by looking for a differentiable solution, you mean that the resulting tensor should have requires_grad=True in the end. If so, we have a problem, because you want a boolean tensor.

If we look at the documentation for autograd, we can see that only floating point (and complex) tensors can require gradients.

If this is the case, then I don't think it is going to be possible to get what you want (a differentiable boolean tensor).
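
A quick way to see this restriction, as a small sketch (the variable names are just for illustration): asking for gradients on a float tensor works, while the same request on a boolean tensor raises a RuntimeError.

```python
import torch

# A floating point tensor can require gradients.
float_t = torch.tensor([1.0, 0.0], requires_grad=True)

# A boolean tensor cannot; PyTorch raises a RuntimeError at construction.
try:
    torch.tensor([True, False], requires_grad=True)
    bool_error = None
except RuntimeError as err:
    bool_error = str(err)

print(float_t.requires_grad)
print(bool_error)
```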

However, let me try and fail with extra steps:

Assuming

You don't care that relevance_map is a dict

We can change it into a tensor in which every row has the same length. To do this, I padded the rows that are too short with -inf.

cartesian_prod = torch.tensor([[0., 9], [0, 5], [0, 7], [1, 9], [1, 5], [1, 7], [2, 9], [2, 5], [2, 7]], requires_grad=True)
relevance_map = torch.tensor([[5,9],[-float('inf'), 9],[5,7]])
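
If you would rather not write the padded tensor by hand, something along these lines could build it from the original dict. This is only a sketch: `pad_relevance` is a hypothetical helper name, and it assumes the dict keys are exactly 0..N-1 so that the row index equals the query index.

```python
import torch

relevance_dict = {0: [5, 9], 1: [9], 2: [5, 7]}

def pad_relevance(rel, pad_value=float("-inf")):
    """Stack the dict's lists into a (num_queries, max_len) float tensor,
    left-padding short rows with pad_value. Assumes keys are 0..N-1."""
    max_len = max(len(docs) for docs in rel.values())
    rows = []
    for q in range(len(rel)):
        docs = rel[q]
        padding = [pad_value] * (max_len - len(docs))
        rows.append(padding + [float(d) for d in docs])
    return torch.tensor(rows)

padded = pad_relevance(relevance_dict)
print(padded)
```

Padding with -inf is convenient because no real document index can ever equal it, so the padded slots never produce a match.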

Solution 1 - no grad

We get a boolean tensor; however, the gradient is lost when we do the equality (==) comparison. Even if that were not the case, any would also remove it.

tmp = torch.any(relevance_map[cartesian_prod[:, 0].long()] == cartesian_prod[:, 1].unsqueeze(0).T, dim=1)
print(tmp)
print(tmp.requires_grad)

# tensor([ True,  True, False,  True, False, False, False,  True,  True])
# False

Solution 2 - not boolean (and weird format)

Here all operations should be differentiable. The problem is that the output is a float tensor, where 0 means True and anything other than 0 means False (as I said, weird format).

tmp2, _ = torch.abs(relevance_map[cartesian_prod[:, 0].long()] - cartesian_prod[:, 1].unsqueeze(0).T).min(dim=1)
print(tmp2)
print(tmp2.requires_grad)

# tensor([0., 0., 2., 0., 4., 2., 2., 0., 0.], grad_fn=<MinBackward0>)
# True

Solution 3 - even more unnecessary steps - still no boolean

Using tmp2 from the last solution.
Here 1 means True and 0 means False (that's better).

div = tmp2.clone()
div[div==0] = 1

tmp3 = -torch.div(tmp2, div) + 1
print(tmp3)
print(tmp3.requires_grad)

# tensor([1., 1., 0., 1., 0., 0., 0., 1., 1.], grad_fn=<AddBackward0>)
# True
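
One caveat worth checking before relying on this: having requires_grad=True does not mean the gradient is useful. The sketch below repeats Solutions 2 and 3 (with unsqueeze(1) in place of unsqueeze(0).T, which gives the same shape) and then calls backward. As far as I can tell, the gradient with respect to cartesian_prod is zero everywhere, because tmp2 / div is piecewise constant and abs has zero slope at exact matches.

```python
import torch

cartesian_prod = torch.tensor(
    [[0., 9], [0, 5], [0, 7], [1, 9], [1, 5], [1, 7], [2, 9], [2, 5], [2, 7]],
    requires_grad=True,
)
relevance_map = torch.tensor([[5, 9], [-float("inf"), 9], [5, 7]])

# Solution 2: distance from each doc index to the nearest relevant doc.
rows = relevance_map[cartesian_prod[:, 0].long()]
tmp2, _ = torch.abs(rows - cartesian_prod[:, 1].unsqueeze(1)).min(dim=1)

# Solution 3: map distance 0 -> 1.0 and any other distance -> 0.0.
div = tmp2.clone()
div[div == 0] = 1
tmp3 = -torch.div(tmp2, div) + 1

# Backpropagate and inspect the gradient that reaches the input.
tmp3.sum().backward()
grad = cartesian_prod.grad
print(tmp3)
print(grad)
```

So the result is differentiable in the technical sense, but an optimizer would get no signal from it.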