Is there a way to reference CUDA tensors instead of copying them to the CPU?

I just want to sample some tensors on the GPU and then use them in another function (which also runs on the GPU). Is there a way to avoid copying back to the CPU? Thanks.

If the other function only uses operations that are supported on the GPU, you won’t have to push the tensor to the CPU.
What is your function doing? If you are using PyTorch functions inside it, everything should work without any device-to-host transfer.
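
To illustrate the point above: a minimal sketch (not the poster’s actual code, and `my_function` is a hypothetical stand-in) showing that chained PyTorch ops keep intermediate results on whatever device the input lives on, so no `.cpu()` call is needed:

```python
import torch

# Use the GPU if one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

def my_function(x):
    # Arbitrary PyTorch ops; all of them run on x's device.
    return (x * 2 + 1).sum()

# Sample directly on the chosen device.
samples = torch.randn(4, 3, device=device)

# No device-to-host transfer happens here; the result stays on the same device.
result = my_function(samples)
print(result.device.type == samples.device.type)
```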

My function uses this data (generated on the GPU) for backpropagation training. I use a third-party library (Ray) to generate the CUDA data, and when I return it to the main function, I get the error “TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.” So I think I can’t return the CUDA tensor directly.

Do you need some numpy functions, or is that just the usual way to convert from Ray to PyTorch tensors?
Sorry, I’m not familiar with Ray, so I would need some input on your workflow.
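
That said, if the error happens because the library serializes the return value (which is what the “convert CUDA tensor to numpy” message suggests), a common pattern is to move the tensor to the CPU before returning it and back to the GPU in the caller. A sketch, assuming that workflow (`remote_sample` is a hypothetical stand-in for your Ray task):

```python
import torch

def remote_sample(device):
    # Inside the remote task: generate data on the GPU...
    data = torch.randn(8, 2, device=device)
    # ...then transfer to CPU so the result can be serialized and returned.
    return data.cpu()

device = "cuda" if torch.cuda.is_available() else "cpu"

# Back in the caller: move the result onto the GPU again and train with it.
result = remote_sample(device).to(device)
result.requires_grad_(True)
loss = (result ** 2).sum()
loss.backward()  # gradients are computed on the same device as `result`
```

The round trip costs one host transfer per task, but backpropagation itself stays entirely on the GPU.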