Is there any faster way to reshape a Variable containing a cuda Tensor?

I need to reshape a Variable (named W) containing a cuda Tensor. I could not use the transpose function offered by torch because I need to change the order of the elements according to a pre-defined method. So I use the following:

torch.cat([W[idx] for idx in p], 0)

in which p is an array containing the new order.
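For concreteness, the slow approach looks roughly like this (a sketch on CPU tensors for portability; the original call was truncated, and `torch.cat` over indexed slices is assumed):

```python
import torch

W = torch.arange(0, 100).float()  # stands in for the cuda tensor
p = list(range(99, -1, -1))       # pre-defined new order (reversal here)

# One small copy (and one allocation) per element -- very slow,
# especially on the GPU where each allocation is a sync point.
PW = torch.cat([W[idx:idx + 1] for idx in p], 0)
```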

It turns out to be very slow. Is it possible to speed it up?

You can use the .permute() function (doc here).
PW = W.permute(*p)
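For reference, a minimal sketch of what .permute() does: it reorders the dimensions of a tensor, not the elements along a dimension.

```python
import torch

x = torch.zeros(2, 3, 4)
y = x.permute(2, 0, 1)  # dimension order becomes (4, 2, 3)
```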

Sorry, I didn’t describe my problem clearly. W is a Variable containing a torch.cuda.FloatTensor of size 100, for example. For simplicity, say W contains 100 elements, i.e., W[0], W[1], … and W[99]. Now I need to produce a new Variable that contains the elements of W, but in a totally different order. Thank you.

One observation: allocation in cuda is a sync point, which will be slow. If you can somehow create the permuted variable once, before any loop, and simply copy the permuted values across, without allocation, each iteration, that might speed things up?

Then PW = W.index_select(0, p), where, if W is a torch.cuda.XXXXTensor(), p has to be a torch.cuda.LongTensor.
If you already have some memory allocated for PW, you can use torch.index_select(W, 0, p, out=PW).
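A minimal sketch of this approach, using plain tensors on CPU for portability (on the GPU, W and p would be a torch.cuda.FloatTensor and a torch.cuda.LongTensor; the Variable wrapper is omitted):

```python
import torch

W = torch.arange(0, 100).float()               # stands in for the 100-element Variable
p = torch.LongTensor(list(range(99, -1, -1)))  # pre-defined new order (reversal here)

# Gather the elements of W in the order given by p, in a single call.
PW = W.index_select(0, p)

# With pre-allocated output memory, no new allocation per call --
# useful inside a loop, per the observation above.
out = torch.zeros(100)
torch.index_select(W, 0, p, out=out)
```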

It works. Thanks a lot!