Variable Length Reduce

I’m looking for an efficient way to do the reduce below without a Python for loop, because the loop is WAY too slow. Any ideas on how to do this variable-length reduce?

Input: I (bool) [M,1], Start (int) [N,1], Stop (int) [N,1] -> Output: O (bool) [N,1]

O = torch.empty(N, 1, dtype=torch.bool)
for n in range(N):
    O[n] = torch.all(I[Start[n]:Stop[n]])

For now I’ve found this to work OK.

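# Prefix sums with a leading zero: s[k] = number of True values in I[:k]
s = torch.cumsum(I.long(), dim=0)
s = torch.cat([s.new_zeros(1, 1), s], dim=0)
# index_select needs 1-D indices, hence the squeeze
x = s.index_select(0, Start.squeeze(1))
y = s.index_select(0, Stop.squeeze(1))
# A slice is all True iff its True-count equals its length
O = (y - x) == (Stop - Start)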
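
For anyone who wants to check the cumulative-sum idea against the loop, here is a minimal sketch wrapped in a function. The name `ranged_all` and the test sizes are made up for illustration, and it assumes 0 <= Start <= Stop <= M with int64 indices. It pads the cumulative sum with a leading zero so that s[Stop] - s[Start] counts exactly the True values in I[Start:Stop], which also keeps Stop == M in bounds.

```python
import torch

def ranged_all(I, Start, Stop):
    """All-reduce over I[Start[n]:Stop[n]] for each n, without a Python loop.

    Assumes I is bool [M,1], Start/Stop are int64 [N,1],
    and 0 <= Start <= Stop <= M (hypothetical helper, not a torch API).
    """
    # Exclusive prefix sum: s[k] = number of True values in I[:k], shape [M+1,1].
    s = torch.cumsum(I.long(), dim=0)
    s = torch.cat([s.new_zeros(1, 1), s], dim=0)
    # index_select needs 1-D indices, so drop the trailing dimension.
    lo = s.index_select(0, Start.squeeze(1))
    hi = s.index_select(0, Stop.squeeze(1))
    # A slice is all True iff its True-count equals its length.
    return (hi - lo) == (Stop - Start)

# Compare against the straightforward loop on random data.
torch.manual_seed(0)
M, N = 1000, 200
I = torch.rand(M, 1) > 0.1
Start = torch.randint(0, M, (N, 1))
Stop = (Start + torch.randint(0, 20, (N, 1))).clamp(max=M)

bounds = zip(Start[:, 0].tolist(), Stop[:, 0].tolist())
expected = torch.stack([torch.all(I[a:b]) for a, b in bounds]).unsqueeze(1)
assert torch.equal(ranged_all(I, Start, Stop), expected)
```

Empty slices (Start == Stop) come out as True, which matches what torch.all returns on an empty tensor.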