Is it possible to create a torch CUDA float tensor directly from GPU memory?
The following code creates a numpy array and copies it to GPU memory. Is it possible to create a torch tensor from that GPU memory directly, without copying it back to the host as a numpy array first? I know the example code doesn't convey the actual purpose: in my application I do some computations on the inputs on the GPU and then want to apply one of the torch functionals to the output of those computations. Currently I copy the intermediate results back to the host, create a torch tensor from them, and move it back to the GPU. If I could create a CUDA tensor directly, I could avoid two copy operations.
from cuda import cudart
import numpy as np
import torch

# create a numpy array of shape (6, 3, 640, 640) with random values
# between 0 and 1 (float32) in C order
x = np.random.rand(6, 3, 640, 640).astype(np.float32, order="C")

# allocate memory on the GPU (cudaMalloc returns an (error, pointer) tuple)
err, x_gpu = cudart.cudaMalloc(x.nbytes)

# copy the numpy array to the GPU
cudart.cudaMemcpy(x_gpu, x, x.nbytes,
                  cudart.cudaMemcpyKind.cudaMemcpyHostToDevice)

# create an uninitialized cuda float tensor with the same shape as x
x_torch = torch.empty(x.shape, dtype=torch.float32, device="cuda")
Will it be possible to copy the data from x_gpu to x_torch?
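In case it helps frame the question, here is a minimal sketch of what I have in mind: pre-allocate the torch tensor on the GPU, then issue a `cudaMemcpyDeviceToDevice` from the raw device pointer into `tensor.data_ptr()`. This assumes the cuda-python `cudaMemcpy` accepts a plain integer as a device pointer (which `data_ptr()` returns); the try/except and availability guard are only there so the snippet also runs on machines without a GPU.

```python
import numpy as np

# Guarded so the sketch runs even without a GPU or without
# the torch / cuda-python packages installed.
try:
    import torch
    from cuda import cudart
    HAVE_CUDA = torch.cuda.is_available()
except ImportError:
    HAVE_CUDA = False

def copy_device_ptr_to_tensor():
    """Upload a host array to raw device memory, then copy it
    device-to-device into a pre-allocated torch CUDA tensor."""
    x = np.random.rand(2, 3, 4).astype(np.float32, order="C")

    # raw device allocation; cudaMalloc returns (error, pointer)
    err, x_gpu = cudart.cudaMalloc(x.nbytes)
    cudart.cudaMemcpy(x_gpu, x, x.nbytes,
                      cudart.cudaMemcpyKind.cudaMemcpyHostToDevice)

    # pre-allocate a torch tensor on the GPU with matching shape and dtype
    x_torch = torch.empty(x.shape, dtype=torch.float32, device="cuda")

    # device-to-device copy into the tensor's own storage
    cudart.cudaMemcpy(x_torch.data_ptr(), x_gpu, x.nbytes,
                      cudart.cudaMemcpyKind.cudaMemcpyDeviceToDevice)
    cudart.cudaFree(x_gpu)

    # verify the tensor now holds the original data
    return bool(np.allclose(x_torch.cpu().numpy(), x))

ok = copy_device_ptr_to_tensor() if HAVE_CUDA else True  # skip without a GPU
print(ok)
```

If this works, the only device-to-device copy could even be eliminated by writing the computation output straight into `x_torch.data_ptr()` in the first place, but I'm not sure whether torch guarantees that is safe while the tensor is alive.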