I have a large 1D tensor A containing around 20M elements. I also have a set of spans with unequal lengths, i.e., B = [(s_1, e_1), (s_2, e_2), ..., (s_n, e_n)], where n may be more than 8K. A single slice A[s:e] is very fast, but slicing all spans in a for loop is very time-consuming. Is there any way to slice in parallel on the GPU? My torch version is 1.8.1, and some operations like apply_() are only available on CPU.
For example:

A = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
B = torch.tensor([[0, 3], [6, 8]])
C = UnknownFunction(A, B)
C should also be a 1D tensor:

[1, 2, 3, 4, 7, 8, 9]
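One idea I have sketched, assuming span ends are inclusive (as the expected output above suggests): build all candidate indices at once with broadcasting, mask out positions beyond each span's length, then do a single fancy-index into A. Is something like this the right direction?

```python
import torch

A = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
B = torch.tensor([[0, 3], [6, 8]])

starts, ends = B[:, 0], B[:, 1]
lengths = ends - starts + 1               # +1 because span ends are treated as inclusive
max_len = int(lengths.max())

# candidate offsets 0..max_len-1, broadcast against every span
offsets = torch.arange(max_len, device=A.device)      # shape (max_len,)
idx = starts.unsqueeze(1) + offsets                   # shape (n, max_len), per-span indices
mask = offsets < lengths.unsqueeze(1)                 # drop positions past each span's end
C = A[idx[mask]]                                      # one gather, spans concatenated in order
```

The intermediate tensors are of shape (n, max_len), so this trades memory for a single vectorized gather; if one span is far longer than the rest, that padding could be wasteful.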
Thanks for your kind help in advance!