Copy raw float buffer to Tensor, efficiently, without numpy

Hey all,

I’m finding that our native file buffer reader can load data about 300x faster if we can skip a numpy ndarray copy. Unfortunately, all attempts to go directly to PyTorch seem to be orders of magnitude slower than going through numpy.

Is there any way I can memcpy (from Python) my data directly into a Tensor efficiently?


Timings for loading the same data:

- Tensor: 90 seconds
- Tensor set_ + FloatStorage: 48 seconds
- TorchVision loader: 7 seconds
- Numpy conversion (torch.from_numpy(numpy.asarray(buffer)), via the buffer protocol): 0.7 seconds
- Native loader: 0.002 seconds
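For context, here is a self-contained sketch of the Numpy conversion route from the list above. The buffer construction is synthetic stand-in data (our real buffer comes from the native file reader); I use numpy.asarray on a float-typed memoryview so the buffer protocol gives a zero-copy view:

```python
import struct
import time

import numpy as np
import torch

# Synthetic stand-in for the native reader's output: n float32 values
# in a writable raw buffer (illustrative only).
n = 1_000_000
buffer = bytearray(struct.pack(f"{n}f", *range(n)))

start = time.perf_counter()
mv = memoryview(buffer).cast("f")  # view the raw bytes as float32
arr = np.asarray(mv)               # zero-copy via the buffer protocol
t = torch.from_numpy(arr)          # shares arr's memory, no extra copy
elapsed = time.perf_counter() - start
print(f"numpy route: {elapsed:.4f}s for {t.numel()} floats")
```

Since both np.asarray and torch.from_numpy alias the underlying memory, the only copy left in this path is whatever produced the buffer itself.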

While the Numpy version is already a big improvement, you can see the motivation for getting our raw data into tensors directly, skipping Numpy entirely.

I’m looking for something like this C++ code, but in Python:

A similar request here:

Following that last post, the code below is the fastest pure-Torch implementation I can muster. However, I'm not sure whether it leaks memory, and it's still slower than numpy:

from_buffer: 1.1 seconds

t = torch.Tensor()
s = torch.Storage.from_buffer(buffer, byte_order="native")
t.set_(s)  # point the empty tensor at the new storage
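For reference, a runnable sketch of that approach with synthetic data (the buffer here is illustrative, and I've used the typed FloatStorage variant). I've also included torch.frombuffer, available in PyTorch 1.10+, which wraps the buffer with no copy at all; the caveat is that the resulting tensor aliases the buffer, so the buffer must outlive the tensor:

```python
import struct
import torch

# Synthetic stand-in for the raw float buffer (illustrative only).
n = 1000
buffer = bytearray(struct.pack(f"{n}f", *range(n)))

# Storage route: from_buffer copies the bytes into a typed storage once,
# then set_ points an empty tensor at that storage without another copy.
s = torch.FloatStorage.from_buffer(buffer, byte_order="native")
t = torch.Tensor()
t.set_(s)

# frombuffer route (PyTorch >= 1.10): no copy at all; the tensor shares
# memory with `buffer`, so the buffer must stay alive as long as t2 does.
t2 = torch.frombuffer(buffer, dtype=torch.float32)
```

If frombuffer is available in your build, it seems closest to the C++-style zero-copy wrap I was originally after.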