Sample equally-sized chunks from a 1D tensor

ygabuev · January 24, 2020, 11:38am

I have a tensor X of size (T). What I need is to randomly sample N chunks from X of size t<T, so that the result is a tensor of size (N, t) and each “row” is originally contiguous in X. What is the best way to do that (without using for-loops, allocating memory when not needed, etc.)?

nairbv · January 24, 2020, 5:16pm

If you do something like:

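import random
import torch

a = torch.arange(20)
length = 5
for x in range(10):
    # randint is inclusive on both ends, so the last valid start is numel - length
    index = random.randint(0, a.numel() - length)
    sample = a[index:index + length]  # a view into a, not a copy
    print(sample)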

each sample shares memory with a.

I’m not sure what you mean by "no loops", though. What do you want to do with each sample? If you need the results to be in a single tensor (e.g. with a batch dimension), it might be harder to share memory.
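A small check of the memory-sharing point, as a sketch rather than anything posted in the thread: a basic slice is a view into a, while stacking several slices into one batched tensor allocates new storage. The variable names (shares_storage, seen_through_view, batch_is_copy) are made up for illustration.

```python
import torch

a = torch.arange(20)
sample = a[3:8]          # basic slicing returns a view, not a copy

# The view starts 3 elements into the same underlying storage as `a`.
shares_storage = sample.data_ptr() == a.data_ptr() + 3 * a.element_size()

a[3] = -1                # a write through `a` is visible through the view
seen_through_view = sample[0].item()

# Stacking views into one (2, 5) batch allocates new memory.
batch = torch.stack([a[0:5], a[5:10]])
batch_is_copy = batch.data_ptr() != a.data_ptr()
```

So per-sample slices cost no extra memory, but collecting them into a single (N, t) tensor generally means a copy of N*t elements.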

ygabuev · January 24, 2020, 8:26pm

Hi @nairbv, currently I do that as follows:

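import numpy as np
import torch

# T, N and t are assumed to be defined already, with t < T
source = np.random.randn(T)
samples = torch.empty(N, t)
inds = np.random.randint(0, T - t + 1, N)  # upper bound is exclusive
for i in range(N):
    ind = inds[i]
    samples[i] = torch.from_numpy(source[ind:ind + t])  # cast to float32 on assignment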

So I am willing to use N*t space for the samples tensor. The problem is that if N is large, the for-loop might take some time to execute (though I am really not sure whether that would be a problem; maybe I’m being overly cautious). I also do not want to use O(Nt) memory for the indices that I will use to extract the samples from source.

I am wondering if there is a neat way to vectorize the operation of extracting the samples from source.

After that, each sample is passed through a neural net.
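One possible way to vectorize this, offered as a sketch and not something proposed in the thread: Tensor.unfold builds a strided view of all T-t+1 sliding windows without copying, and indexing that view with N random start offsets gathers the (N, t) result in one call. The index tensor has only N entries, not N*t. The helper name sample_chunks and the sizes below are hypothetical, and source is assumed to already be a torch tensor.

```python
import torch

def sample_chunks(source, num_chunks, chunk_len):
    """Return (chunks, starts): random contiguous chunks and their start offsets."""
    # unfold builds a strided view of shape (T - chunk_len + 1, chunk_len);
    # no data is copied at this step.
    windows = source.unfold(0, chunk_len, 1)
    # One random start offset per chunk: O(num_chunks) index memory.
    starts = torch.randint(0, windows.size(0), (num_chunks,))
    # Advanced indexing gathers the selected rows into a new (num_chunks, chunk_len) tensor.
    return windows[starts], starts

T, N, t = 1000, 64, 16      # example sizes, chosen only for illustration
source = torch.randn(T)
samples, starts = sample_chunks(source, N, t)
```

The output still takes N*t elements, which matches the space budget described above; only the intermediate window view is free. If source starts out as a NumPy array, torch.from_numpy(source) wraps it without copying before the call.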