What is the overhead of transforming numpy to torch and vice versa?

yxchng · September 14, 2017, 2:39pm

smth · September 14, 2017, 2:41pm

Probably about 1 microsecond (basically the cost of a Python call). There is no memory copy involved, so it’s quite efficient.

stsievert · September 14, 2017, 7:37pm

A CPU PyTorch tensor and the NumPy array it is converted to (or from) share the same memory, so the conversion does not copy any data.
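A quick way to see the sharing for yourself is the sketch below. It assumes a CPU tensor and a recent PyTorch and NumPy; the variable names are only for illustration.

```python
import numpy as np
import torch

a = np.zeros(3, dtype=np.float32)

# from_numpy wraps the existing NumPy buffer instead of copying it.
t = torch.from_numpy(a)
t[0] = 1.0          # write through the tensor

# .numpy() on a CPU tensor also returns a view of the same buffer.
b = t.numpy()
b[1] = 2.0          # write through the second array

# All three objects see both writes because they share one buffer.
shares_memory = np.shares_memory(a, b)
print(a, t, shares_memory)
```

The flip side is that an in-place change to one object silently changes the other, so call `.clone()` or `.copy()` if you need an independent copy.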

I’ve compared converting to NumPy arrays from PyTorch and TensorFlow here: http://stsievert.com/blog/2017/09/07/pytorch/

On my local machine, PyTorch takes 0.5 microseconds to convert between the two.
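If you want to measure this on your own machine, something like the following should work. It is only a rough sketch using `timeit`, not the code from the blog post; the array size and loop count are arbitrary choices, and the numbers will vary with hardware and library versions.

```python
import timeit

import numpy as np
import torch

x = np.random.rand(1000, 1000).astype(np.float32)
t = torch.from_numpy(x)

n = 10_000  # number of repetitions (arbitrary)

# Average time per conversion in seconds, for each direction.
to_torch = timeit.timeit(lambda: torch.from_numpy(x), number=n) / n
to_numpy = timeit.timeit(lambda: t.numpy(), number=n) / n

print(f"numpy -> torch: {to_torch * 1e6:.2f} us per call")
print(f"torch -> numpy: {to_numpy * 1e6:.2f} us per call")
```

Since no data is copied, the time per call should not depend much on the size of the array.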

yxchng · September 15, 2017, 1:02am

@smth @stsievert what if I convert from a CUDA tensor? Is that just two Python function calls, or does moving a GPU tensor to the CPU take much longer? I have been using quite a number of conversions in my code and am wondering if they are slowing me down.

stsievert · September 15, 2017, 2:58pm

The code behind these timings can be found at https://github.com/stsievert/pytorch-timing-comparisons in Jupyter notebooks. They time the conversion of a CPU tensor to a NumPy array, for both TensorFlow and PyTorch.

I would expect that converting from a PyTorch GPU tensor to an ndarray is O(n), since it has to transfer all n floats from GPU memory to CPU memory. I’m not sure about the constant factor, but I would expect it to be fairly small.
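For reference, here is roughly what the GPU case looks like in code. This is a sketch assuming a recent PyTorch release; `to_numpy` is a hypothetical helper name, not a PyTorch function. Calling `.numpy()` directly on a CUDA tensor raises an error, so the tensor has to go through `.cpu()` first, and that step is where the copy happens.

```python
import torch


def to_numpy(t):
    """Hypothetical helper: return a NumPy array for a CPU or CUDA tensor."""
    # detach() drops any autograd history; cpu() copies device memory to
    # host memory for a CUDA tensor and is a no-op for a CPU tensor.
    return t.detach().cpu().numpy()


# Fall back to the CPU so the sketch still runs without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.ones(4, device=device)

arr = to_numpy(t)
print(type(arr), arr.shape)
```

So from a CUDA tensor it is two cheap Python calls plus a device-to-host copy whose cost grows with the tensor size.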

stsievert · September 16, 2017, 4:42am

Of course the big O constant is small – memory copies are fast.

But the big O constant is still significant. Memory is a bottleneck – CPUs spend most of their time waiting for registers to be filled, not waiting for computation to be finished.

Try to minimize the number of CPU <=> GPU transfers. I believe you want to use async=True (renamed non_blocking=True as of PyTorch 0.4) in cuda or cpu when you can.
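A minimal sketch of an asynchronous host-to-GPU copy is below. It assumes PyTorch 0.4 or later, where the argument is spelled `non_blocking`; `upload` is a hypothetical helper name. As far as I know the copy can only overlap with other work when the source tensor is in pinned (page-locked) host memory.

```python
import torch


def upload(batch):
    """Hypothetical helper: move a CPU tensor to the GPU without blocking."""
    if not torch.cuda.is_available():
        # No GPU available: return the tensor unchanged so the sketch still runs.
        return batch
    # pin_memory() copies the tensor into page-locked host memory, which is
    # what allows cuda(non_blocking=True) to return before the copy finishes.
    return batch.pin_memory().cuda(non_blocking=True)


out = upload(torch.ones(8))

# .item() needs the value on the host, so it waits for the transfer
# and the sum to complete before returning.
total = out.sum().item()
print(total)
```

Pinning a tensor is itself a copy, so in a training loop it is usually better to let the data loader do it with `DataLoader(..., pin_memory=True)` than to pin each batch by hand.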