I have a tensor X of size (T). What I need is to randomly sample N chunks of size t < T from X, so that the result is a tensor of size (N, t) and each “row” is a contiguous slice of X. What is the best way to do that (without using for-loops, without allocating memory when not needed, etc.)?
If you do something like:
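import random
import torch
a = torch.arange(20)
length = 5
for x in range(10):
    # random.randint is inclusive on both ends, so the last valid start is numel() - length
    index = random.randint(0, a.numel() - length)
    sample = a[index:index + length]  # basic slicing returns a view, not a copy
    print(sample)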
each sample shares memory with a.
I’m not sure what you mean by “no loops”, though. What do you want to do with each sample? If you need the results to be in a single tensor (e.g. with a batch dimension), it might be harder to share memory.
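To make the memory-sharing point concrete, here is a small sketch (the specific offsets are arbitrary and chosen only for illustration). It checks that a slice aliases the original tensor, while stacking slices into a batch produces a separate copy:

```python
import torch

a = torch.arange(20)
length = 5

# Basic slicing returns a view: the slice starts at the same address as a[3:].
sample = a[3:3 + length]
shares_memory = sample.data_ptr() == a[3:].data_ptr()

# A write through the original tensor is visible in the view.
a[3] = 100
view_sees_write = sample[0].item() == 100

# torch.stack allocates a new (2, length) tensor, so the batch no longer aliases `a`.
batch = torch.stack([a[0:length], a[10:10 + length]])
a[0] = -1
batch_is_copy = batch[0, 0].item() == 0  # still the old value of a[0]

print(shares_memory, view_sees_write, batch_is_copy)
```

So individual samples cost no extra memory, but as soon as they are collected into one (N, length) tensor this way, the N*length copy is paid.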
Hi @nairbv, currently I do that as follows:
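import numpy as np
import torch
T, N, t = 1000, 16, 50  # example sizes (assumed; any t < T works)
source = np.random.randn(T)
samples = torch.empty(N, t)
inds = np.random.randint(0, T - t + 1, N)  # upper bound is exclusive
for i in range(N):
    ind = inds[i]
    # copy the chunk into row i (the float64 source is cast to samples' float32)
    samples[i] = torch.from_numpy(source[ind:ind + t])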
So, I am willing to use O(N*t) space for the samples tensor. The problem is that if N is large, the for-loop might take some time to execute (though I am really not sure whether that would be a problem; maybe I’m being overly cautious). I also do not want to use O(N*t) memory for indices that I would use to extract samples from source.
I am wondering if there is a neat way to vectorize the operation of extracting the samples from source.
After that, each sample is passed through a neural net.
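One possible way to vectorize this, sketched under the assumption that source is already a 1-D torch tensor (for a NumPy array, torch.from_numpy(source) would give one without copying): Tensor.unfold returns a view of every length-t window, and indexing that view with N random start offsets gathers the rows in a single call. The helper name sample_chunks and the example sizes are hypothetical, not from the thread. As far as I know, the indexing copies only the selected rows rather than materializing all windows, but that is worth checking with a profiler for your sizes.

```python
import torch

def sample_chunks(source: torch.Tensor, n: int, t: int) -> torch.Tensor:
    """Return an (n, t) tensor whose rows are random contiguous chunks of a 1-D `source`."""
    # View of every length-t window, shape (T - t + 1, t); no data is copied here.
    windows = source.unfold(0, t, 1)
    # One random start offset per chunk, so index memory is O(n), not O(n * t).
    starts = torch.randint(0, windows.size(0), (n,))
    # Advanced indexing gathers the selected windows into a new (n, t) tensor.
    return windows[starts]

T, N, t = 1000, 16, 50  # example sizes, chosen only for illustration
source = torch.randn(T)
samples = sample_chunks(source, N, t)
print(samples.shape)
```

The result is a fresh (N, t) tensor, so it uses the N*t space you already budgeted, and it can be passed to the network as a single batch.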