Efficiently move data from torch::Tensor to std::vector

In C++ I have a Tensor of shape (32, 1211, 34) on a GPU. I want to move its data into a std::vector. Currently I do it like this:

typedef std::vector<std::vector<float>> EmissionT;
// ...
auto tensor = i_value_out.toTensor().to(torch::kCPU);
auto outShape = tensor.sizes();  // (32, 1211, 34)

std::vector<EmissionT> emissions(
    outShape[0], EmissionT(outShape[1], std::vector<float>(outShape[2])));
auto access = tensor.accessor<float, 3>();

// Element-wise copy: batch x time x feature.
// sizes() returns int64_t values, so use a signed loop index.
for (int64_t nBatches = 0; nBatches < outShape[0]; ++nBatches) {
    for (int64_t T = 0; T < outShape[1]; ++T) {
        for (int64_t N = 0; N < outShape[2]; ++N) {
            emissions[nBatches][T][N] = access[nBatches][T][N];
        }
    }
}
Is there a way to avoid copying and just move data?

No. The problem is that std::vector always owns its own allocation; it cannot adopt an existing buffer, so a copy is unavoidable.
If you only need array-like access, you can use an ArrayRef instead, but then you have to keep the underlying tensor alive for as long as the view is in use.
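As a rough illustration of the non-owning-view idea: a hand-rolled view type stands in here for c10::ArrayRef, and a std::vector stands in for the tensor's storage, so the sketch compiles without libtorch. In real code the pointer would come from tensor.data_ptr<float>() on a contiguous CPU tensor.

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for c10::ArrayRef<float>: a non-owning (pointer, length)
// pair over memory that someone else keeps alive. No data is copied.
struct FloatView {
    const float* data;
    std::size_t len;
    const float& operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return len; }
};

// In real code the pointer would be tensor.data_ptr<float>(); the caller
// must keep the owner (the tensor) alive for the lifetime of the view.
inline FloatView viewOf(const std::vector<float>& owner) {
    return FloatView{owner.data(), owner.size()};
}
```

The view is only valid while the owning object exists; destroying the tensor (or the stand-in vector) leaves the view dangling.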

Alternatively, you could allocate the vector yourself, wrap its memory with torch::from_blob, and then use an _out variant of the final op to write directly into it. Depending on the op, that would just move the copying deeper into PyTorch, though.
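If a copy is acceptable but the triple loop is not, a common compromise is one flat std::vector filled by a single bulk copy from the tensor's contiguous buffer, plus an index helper for (batch, time, feature) access. A minimal sketch, with a plain pointer standing in for tensor.contiguous().data_ptr<float>() so it is self-contained:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Copy a contiguous (B, T, N) buffer into one flat vector in a single pass.
// In real code `src` would come from tensor.contiguous().data_ptr<float>().
std::vector<float> flatCopy(const float* src, std::size_t B, std::size_t T,
                            std::size_t N) {
    std::vector<float> out(B * T * N);
    std::copy(src, src + out.size(), out.begin());
    return out;
}

// Offset of element (b, t, n) in the flattened row-major (B, T, N) layout.
inline std::size_t idx(std::size_t b, std::size_t t, std::size_t n,
                       std::size_t T, std::size_t N) {
    return (b * T + t) * N + n;
}
```

This keeps a single allocation and lets memcpy-style copying do the work, at the cost of computing indices by hand instead of nesting vectors.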

Best regards