Hi,

I am currently wondering how we could make an efficient implementation of the LCS (longest common subsequence) problem.

I found a way to detect consecutive matches (i.e. n-gram matches) with tensor operations, by comparing and shifting.

With two sequences x (len: n) and y (len: m), the matrix

`e = x.unsqueeze(1).eq(y)  # [n x m]`

satisfies `e[i, j] == 1 <=> x[i] == y[j]`, so an n-gram match shows up as a diagonal of ones.
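As a quick sanity check with toy tensors (writing the comparison as `x.unsqueeze(1).eq(y)` so the result comes out `[n x m]`, row-indexed by `x`):

```python
import torch

# toy sequences sharing the bigram (5, 6)
x = torch.tensor([1, 5, 6, 2])  # n = 4
y = torch.tensor([5, 6, 3])     # m = 3

e = x.unsqueeze(1).eq(y)        # [n x m], e[i, j] <=> x[i] == y[j]
# the shared bigram appears as a diagonal of ones at (1, 0) and (2, 1)
print(e.long())
```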

Thus we can do the following:

```
# match_1 = [n x m] matrix of {0, 1}
match_1 = x.unsqueeze(1).eq(y)
# match_2 = [(n-1) x (m-1)] matrix of {0, 1}:
# match_2[i, j] == 1 <=> x[i] == y[j] and x[i+1] == y[j+1]
match_2 = match_1[:-1, :-1] * match_1[1:, 1:]
# and so on for longer n-grams
```
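Iterating that shift-and-multiply until no ones remain gives the length of the longest gap-free match (the longest common substring). A minimal sketch, assuming 1-D integer tensors:

```python
import torch

def longest_common_substring_len(x, y):
    """Length of the longest run of consecutive matches (no gaps)."""
    m = x.unsqueeze(1).eq(y)  # [n x m] boolean match matrix
    k = 0
    # each shift-and-AND shortens every diagonal run of ones by one,
    # so the number of iterations until all ones vanish is the longest run
    while m.numel() > 0 and m.any():
        k += 1
        m = m[:-1, :-1] & m[1:, 1:]
    return k
```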

The LCS problem proper is more complicated, since we allow gaps. It can be solved with dynamic programming in O(n x m), but that doesn't really port to PyTorch, right?

I tried, it's super slow.

```
# considering two "sentences" (LongTensors of word indices)
# ts, tr of respective lengths n and m
table = torch.zeros(n + 1, m + 1)
for i in range(1, n + 1):
    for j in range(1, m + 1):
        if ts[i - 1] == tr[j - 1]:
            table[i, j] = table[i - 1, j - 1] + 1
        else:
            table[i, j] = max(table[i - 1, j], table[i, j - 1])
lcs = table[n, m]
```
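One direction I've been toying with (just a sketch, not benchmarked): each cell only depends on its three upper-left neighbours, so all cells on the same anti-diagonal (constant i + j) can be filled at once, leaving a Python loop of length n + m - 1 instead of n x m iterations:

```python
import torch

def lcs_length(x, y):
    """LCS length via DP, vectorized over anti-diagonals.

    table[i, j] depends only on table[i-1, j-1], table[i-1, j] and
    table[i, j-1], so every cell with the same i + j can be updated
    in one batched indexing operation.
    """
    n, m = x.size(0), y.size(0)
    table = torch.zeros(n + 1, m + 1, dtype=torch.long)
    for d in range(2, n + m + 1):                     # d = i + j
        i = torch.arange(max(1, d - m), min(n, d - 1) + 1)
        j = d - i
        match = (x[i - 1] == y[j - 1]).long()
        table[i, j] = torch.max(table[i - 1, j - 1] + match,
                                torch.max(table[i - 1, j], table[i, j - 1]))
    return table[n, m].item()
```

The per-diagonal update uses advanced indexing with index tensors `i` and `j`, and the elementwise two-tensor form of `torch.max`.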

Any idea to make it more efficient?