How to concatenate tensors without copying?

I am using torchlib for reinforcement learning. One thing I do frequently is sample a batch of data from a replay buffer, in which a large amount of tensor data is stored on the GPU, and concatenate the samples into one big tensor as input for model training.
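For concreteness, here is a minimal sketch of the pattern I mean (the buffer layout, names, shapes, and sizes are all made up for illustration):

```python
import torch

# Hypothetical sizes, just for illustration.
capacity, obs_dim, batch_size = 100_000, 64, 256
device = "cuda" if torch.cuda.is_available() else "cpu"

# Replay buffer kept as one pre-allocated tensor on the GPU.
buffer = torch.zeros(capacity, obs_dim, device=device)

# Sampling a random batch and concatenating it into one big tensor:
# torch.stack / torch.cat always allocate a new tensor and copy into it,
# and advanced indexing (buffer[idx]) also returns a copy, not a view.
idx = torch.randint(0, capacity, (batch_size,), device=device)
batch = torch.stack([buffer[i] for i in idx.tolist()])
assert batch.data_ptr() != buffer.data_ptr()
```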

Sampling generally copies the selected data. If the sampled batch could instead be a reference (a view) into the buffer, then:

  1. training would be more efficient, since there is no copy;
  2. there would be a lower probability of “out of memory”, since GPU memory usage stays fixed (or pre-allocated) because the replay buffer size is fixed.
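As far as I can tell, there are only partial workarounds today: a contiguous slice is a true view, and a pre-allocated output tensor can at least avoid re-allocating the batch every step. A sketch, assuming the same made-up layout as above:

```python
import torch

# Same hypothetical layout as in the sketch above.
capacity, obs_dim, batch_size = 100_000, 64, 256
device = "cuda" if torch.cuda.is_available() else "cpu"
buffer = torch.zeros(capacity, obs_dim, device=device)

# (a) A contiguous slice is a view: zero copy, but it only works
#     when the sampled indices happen to be consecutive.
window = buffer[1000:1000 + batch_size]
assert window.data_ptr() == buffer[1000].data_ptr()

# (b) For uniformly random indices a copy seems unavoidable, but the
#     allocation can at least be reused by writing into a pre-allocated
#     output tensor instead of allocating a fresh batch every step.
idx = torch.randint(0, capacity, (batch_size,), device=device)
batch = torch.empty(batch_size, obs_dim, device=device)
torch.index_select(buffer, 0, idx, out=batch)
```

Neither of these gives a zero-copy batch for uniformly random indices, though, which is what I am really after.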

I wonder whether there is a way to do this. If not, would this be a good feature request?