I would like help optimizing the following line of code, please. We are training an RNN, and when profiling our code we find that this line is by far the bottleneck:
signal_seq = torch.stack([self.full_signal[idx+i:idx+i+64] for i in reversed(range(0, -120, -6))], dim=0)
full_signal is a 1D cuda.FloatTensor, and idx is a positive integer.
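For reference, here is a minimal self-contained sketch of the pattern (the random data, signal length, and idx value are stand-ins for illustration; our real full_signal lives on the GPU):

```python
import torch

# Stand-in for our 1D signal; the real one is a cuda.FloatTensor.
full_signal = torch.randn(100_000)
idx = 500  # arbitrary positive index for the repro

# 20 windows of length 64 whose start offsets step back from idx in
# strides of 6: reversed(range(0, -120, -6)) yields -114, -108, ..., -6, 0.
signal_seq = torch.stack(
    [full_signal[idx + i : idx + i + 64] for i in reversed(range(0, -120, -6))],
    dim=0,
)

print(signal_seq.shape)  # torch.Size([20, 64])
```

The last row is simply full_signal[idx:idx+64], and each earlier row is shifted back by 6 samples.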
This line is in the __getitem__() method of our Dataset, and it seems to take 16s out of 18s in our tests. When running torch.utils.bottleneck, cProfile reports that nearly all of this time is apparently spent in the tensor() method.
Is there a more efficient way of doing this, please?
Thank you in advance.