I have a list of sentences and I am looking to extract contents between two items.
If the start or end item does not exist, I want it to return a row with padding only.
I already have the sentences tokenized and padded with 0 to a fixed length.
I figured a way to do this using for loops, but it is extremely slow, so would like to
know what is the best way to solve this.
import torch
start_value, end_value = 4,9
data = torch.tensor([
[3,4,7,8,9,2,0,0,0,0],
[1,5,3,4,7,2,8,9,10,0],
[3,4,7,8,10,0,0,0,0,0], # does not contain end value
[3,7,5,9,2,0,0,0,0,0], # does not contain start value
])
#expected output
[
[7,8,0,0,0,0,0,0,0,0],
[7,2,8,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0,0],
]