Is anyone working with a CPU -> GPU data pipeline? I am developing a library of methods for faster transfer to the GPU. In some cases it is 370x faster than using PyTorch's pinned CPU tensors.

Let me know what your pipeline is and I’ll try to add methods for it. Just show me your code.

I am developing methods for fast transfer between CPU and GPU and am currently coding them. Show me your code (a Colab notebook would be really helpful) and I’ll see how to incorporate the library into it for faster data transfer.

@Santosh-Gupta Hi, thanks for your work and proposal. I am very interested in tools to speed up CPU-GPU-CPU data transfers. I have written up the details of my application in this post, where you will find what I tried and access to the code.

Basically, I'm trying to do real-time control at 100 Hz on a constrained device (Jetson Nano), using PyTorch to compute the control laws on the GPU. I am sure there are several bottlenecks, but one of them is the CPU-GPU and GPU-CPU data transfers.
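To put a number on that bottleneck: a 100 Hz loop leaves 10 ms per cycle, so it helps to time each stage against that budget. Below is a rough sketch of a timing helper, not code from the application or from SpeedTorch. The names `time_step`, `fake_transfer`, and the `sync` hook are made up for illustration, and `fake_transfer` is only a CPU stand-in for the real transfer so the snippet runs anywhere.

```python
import statistics
import time


def time_step(step, sync=None, budget_s=0.01, warmup=5, iters=50):
    """Time step() and compare its median duration with a per-cycle budget.

    step:     callable for the stage being measured (hypothetical name).
    sync:     optional callable run before each clock read. For CUDA work
              this is assumed to be torch.cuda.synchronize, because kernels
              and non-blocking copies return before they have finished.
    budget_s: time available per cycle; 0.01 s corresponds to 100 Hz.
    """
    sync = sync or (lambda: None)

    # Warm-up runs so one-off costs (allocation, caching) are not measured.
    for _ in range(warmup):
        step()
    sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        step()
        sync()  # wait for outstanding work before stopping the clock
        samples.append(time.perf_counter() - start)

    median = statistics.median(samples)
    return {
        "median_s": median,
        "worst_s": max(samples),
        "budget_fraction": median / budget_s,
    }


def fake_transfer():
    # Stand-in workload; replace with the real CPU -> GPU -> CPU round trip.
    sum(range(1000))


report = time_step(fake_transfer)
print(report)
```

With PyTorch, `step` could be something like `lambda: x.to("cuda", non_blocking=True).cpu()` and `sync` could be `torch.cuda.synchronize`. Without the synchronize call, the timer may only capture the launch cost of asynchronous work rather than the transfer itself. For a control loop, `worst_s` is probably more telling than the median, since a single slow cycle can miss the deadline.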

I am starting to rewrite the code to use torch JIT, but I am not sure how much of a speedup to expect. I would appreciate your support in trying out SpeedTorch (which I read about in your post), and hopefully I will get a nice performance increase.

@juanmed it’s out