I need to classify a BMP image in C++ within 20 ms.
Once the file is opened, I have a pointer to an array where the RGB pixels are stored one after the other.
So I need to fill a Tensor as fast as possible from this array. Here is the code I use:
torch::Tensor image = torch::zeros({window_height, window_width, 3});
auto p = image.accessor<float, 3>();
for (int yy = origin_y; yy < origin_y + window_height; yy++) {
    // unsigned char *addr_line = addr_origin + yy * width * step;
    unsigned char *addr_line = addr_origin + (height - 1 - yy) * width * step;
    for (int xx = origin_x; xx < origin_x + window_width; xx++) {
        // 1 - Image coded in BGR, but Tensors are RGB
        // 2 - BMP images are coded from bottom to top. Pixel (i, j) once flipped is then (height - 1 - i, j)
        auto tensor_pixel = p[yy - origin_y][xx - origin_x];
        auto array_pixel = addr_line + xx * step;
        tensor_pixel[0] = *(array_pixel + 2) / 255.;
        tensor_pixel[1] = *(array_pixel + 1) / 255.;
        tensor_pixel[2] = *(array_pixel) / 255.;
    }
}
image = torch::unsqueeze(image, 0);
I put the channel dimension in 3rd position to speed up the filling, but how can I put it back in first position before unsqueezing (for the batch dimension)? I'm looking for a function similar to "permute" in Python.
Awesome, thanks a lot!
Would you have any tips for better understanding the documentation? I'm having a hard time using the C++ backend of Libtorch; there aren't that many usage examples (at least, not easily findable…).
I usually check the C++ docs first for a matching function (for permute I knew it was there), and if I cannot find it, I look at the C++ code for the method (the libtorch tests are a good place to find examples).
Hi, may I ask a question?
I found that tensor1.permute({0,2,1,3}); in C++ gives different values from tensor1.permute(0,2,1,3) in Python 3. Are there any insights on that?
No, this shouldn't be the case if you index the result tensor in the correct way.
Note that permute does not return a contiguous tensor. So check if you are trying to index the raw data without respecting the strides in C++.
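To illustrate the point about strides, here is a minimal plain C++ sketch of a strided view. It is not libtorch code: View3, make_contiguous, permute and to_contiguous are hypothetical helpers that only mimic how a tensor pairs sizes with strides. It shows that a permute reorders sizes and strides without moving any data, so reading the raw buffer as if it were row-major in the new shape gives different values than stride-aware indexing does.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// A minimal 3-D strided view over a flat float buffer (hypothetical type).
struct View3 {
    const float* data;
    std::array<std::size_t, 3> sizes;
    std::array<std::size_t, 3> strides;

    // Stride-aware element access.
    float at(std::size_t i, std::size_t j, std::size_t k) const {
        return data[i * strides[0] + j * strides[1] + k * strides[2]];
    }
};

// Row-major (contiguous) view over `data` with the given sizes.
View3 make_contiguous(const float* data, std::array<std::size_t, 3> sizes) {
    return {data, sizes, {sizes[1] * sizes[2], sizes[2], 1}};
}

// Reorders sizes and strides; the underlying buffer is left untouched.
View3 permute(const View3& v, std::array<std::size_t, 3> dims) {
    return {v.data,
            {v.sizes[dims[0]], v.sizes[dims[1]], v.sizes[dims[2]]},
            {v.strides[dims[0]], v.strides[dims[1]], v.strides[dims[2]]}};
}

// Copies the view into a fresh row-major buffer, element by element.
std::vector<float> to_contiguous(const View3& v) {
    std::vector<float> out;
    out.reserve(v.sizes[0] * v.sizes[1] * v.sizes[2]);
    for (std::size_t i = 0; i < v.sizes[0]; ++i)
        for (std::size_t j = 0; j < v.sizes[1]; ++j)
            for (std::size_t k = 0; k < v.sizes[2]; ++k)
                out.push_back(v.at(i, j, k));
    return out;
}
```

In libtorch the corresponding information is available through the tensor's strides() method, and calling contiguous() on the permuted tensor yields a copy whose raw memory matches the new dimension order, which is what code reading through a raw data pointer needs.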