My network outputs images as tensors on the GPU, and I need to run some postprocessing steps on them.
Currently I do that with OpenCV's CUDA module:
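    # Device -> host: copy the tensor out of GPU memory into a NumPy array
    cpu_numpy = gpu_tensor.byte().cpu().numpy()
    # Host -> device: upload the same data back into GPU memory for OpenCV
    gpu_mat = cv2.cuda_GpuMat()
    gpu_mat.upload(cpu_numpy)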
But this means the image data is copied from GPU memory to host memory and then back to the GPU.
Is there a way to process the image tensors directly on the GPU?
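
To illustrate what "directly on the GPU" could look like, here is a minimal sketch that assumes the postprocessing can be expressed with PyTorch ops. The function name `postprocess_on_device`, the 3x3 box blur, and the input shape are hypothetical placeholders, not the actual postprocessing in question, and this does not use OpenCV at all.

```python
import torch
import torch.nn.functional as F


def postprocess_on_device(img: torch.Tensor) -> torch.Tensor:
    """Hypothetical postprocess: 3x3 box blur, then quantize to uint8.

    img: float tensor of shape (N, C, H, W) with values in [0, 1].
    The result stays on the same device as the input.
    """
    # 3x3 box blur; padding=1 keeps H and W unchanged, and
    # count_include_pad=False averages only over real pixels at the borders.
    blurred = F.avg_pool2d(
        img, kernel_size=3, stride=1, padding=1, count_include_pad=False
    )
    # Scale to [0, 255] and convert to uint8 without leaving the device.
    return (blurred.clamp(0, 1) * 255).round().to(torch.uint8)


# Falls back to CPU so the sketch also runs on a machine without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
gpu_tensor = torch.rand(1, 3, 64, 64, device=device)  # stand-in for the network output
out = postprocess_on_device(gpu_tensor)
```

Nothing here calls `.cpu()` or `.numpy()`, so the data never passes through host memory. Whether this is enough depends on whether the needed OpenCV operations have a PyTorch equivalent.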