Trouble with YUV420 to float tensor conversion

I’m trying to use imageYUV420CenterCropToFloat32Tensor to convert an image in YUV420 format, and I’m getting odd results. If I convert the resulting RGB tensor straight back to YUV, I get artifacts: the image looks interleaved and duplicated 4 times, with some color changes.

I’m probably misunderstanding something about YUV420 encoding. Looking at the code here (pytorch/pytorch_vision_jni.cpp at master · pytorch/pytorch · GitHub), the y dimension seems to be halved twice: once in (yBeforeRtn >> 1) and once more in an earlier line (int uvRowStride = uRowStride >> 1). However, the x dimension isn’t halved at all.

My understanding is that both x and y dimensions are halved in 420 subsampling, so this struck me as odd. For fun, if I just take my YUV image and read the UV indices as this function does and then write back to the UV buffers as I expected the format to be, I get similar artifacts.


for (i in 0 until (imageHeight shr 1)) {
  for (j in 0 until imageWidth) {
    // Read as the function does (row stride halved, x not halved); write where I expected 4:2:0 samples to live.
    uBuffer.put((j / 2) * uPixelStride + i * uRowStride, uBufferOrig!!.get(j * uPixelStride + i * uRowStride / 2))
    vBuffer.put((j / 2) * vPixelStride + i * vRowStride, vBufferOrig!!.get(j * vPixelStride + i * vRowStride / 2))
  }
}

The “interleaved” artifacts could point to a layout mismatch (channels-first vs. channels-last), or to a view operation being used where a permute op would be necessary. Could you check whether the memory layout matches the one the code expects?

Yeah, it took me a while to reason about what expects channels-first vs. channels-last. I eventually got this working by copying the C++ code from the PyTorch torchvision utils file and modifying it to behave the way I expected (halving the vertical dimension only once, and adding a halving of the horizontal dimension).
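For reference, the indexing described above (each dimension halved exactly once) can be sketched as follows. This is a minimal illustration, not the code from the PyTorch repo: `chromaOffset` is a hypothetical helper, and it assumes `rowStride` and `pixelStride` are the values reported for the U or V plane by `android.media.Image.Plane` (`getRowStride` / `getPixelStride`).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PyTorch sources).
// Returns the byte offset, within a U or V plane, of the chroma sample that
// covers luma pixel (x, y) in a 4:2:0 image. One chroma sample is shared by
// a 2x2 block of luma pixels, so both coordinates are halved exactly once.
inline std::size_t chromaOffset(int x, int y, int rowStride, int pixelStride) {
  assert(x >= 0 && y >= 0 && rowStride > 0 && pixelStride > 0);
  const int chromaRow = y >> 1;  // vertical dimension halved once
  const int chromaCol = x >> 1;  // horizontal dimension halved once
  // The row stride is used as reported by the plane; it is not halved again.
  return static_cast<std::size_t>(chromaRow) * rowStride +
         static_cast<std::size_t>(chromaCol) * pixelStride;
}
```

With an assumed row stride of 8 and pixel stride of 2, pixels (0, 0) and (1, 1) both map to offset 0, while (2, 0) maps to offset 2 and (0, 2) maps to offset 8.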

I can isolate the problem from RGB channel ordering, and even from reading the RGB output at all. I think the read access pattern for the U/V buffers is off.

Here’s the code from the pytorch repo: pytorch/pytorch_vision_jni.cpp at 104b2c610b464d205e867dd136973c8bd1913ba8 · pytorch/pytorch · GitHub. I just added two lines that write 0 to each U/V buffer index as it is read: activity.kt · GitHub

On the android side, I’m just calling the function on an image. I don’t even try to use its output: activity.kt · GitHub

This is what I get on my phone (Pixel 3XL): only half the screen turns green, which is what you’d expect wherever U and V are 0. The rest of the image seems untouched, so I don’t think the function is even reading the whole U/V buffer. This matches what I’d expect if the vertical dimension is indeed being halved twice, as I think the current code in the repo does.
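A rough way to check the "only about half the buffer is read" observation is to count the distinct chroma offsets the described access pattern would touch. The sketch below paraphrases the pattern as described in this thread (row index halved, row stride halved again, x not halved). It is not a copy of the repo code, and the function names and the interleaved layout in the usage note are assumptions.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: counts the distinct chroma offsets touched when every
// luma pixel (x, y) is mapped with the access pattern described in the thread:
// (y >> 1) * (rowStride >> 1) + x * pixelStride.
inline std::size_t touchedWithDoubleHalving(int width, int height,
                                            int rowStride, int pixelStride) {
  std::set<std::size_t> touched;
  const int halvedRowStride = rowStride >> 1;  // second halving of the vertical step
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      touched.insert(static_cast<std::size_t>((y >> 1) * halvedRowStride +
                                              x * pixelStride));
    }
  }
  return touched.size();
}

// Number of chroma samples a 4:2:0 plane holds for a width x height image.
inline std::size_t chromaSampleCount(int width, int height) {
  return static_cast<std::size_t>(width / 2) * static_cast<std::size_t>(height / 2);
}
```

For a 64x64 frame with an assumed interleaved chroma plane (row stride 64, pixel stride 2), `touchedWithDoubleHalving(64, 64, 64, 2)` gives 560 distinct offsets against `chromaSampleCount(64, 64) == 1024` samples, which is roughly the half-covered result seen on the device.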

Hello @laganojunior

Thanks for reporting it.

There was a bug in the YUV to RGB conversion. The fix landed in master recently: [android] Fix YUV camera image to tensor by IvanKobzarev · Pull Request #50871 · pytorch/pytorch · GitHub

The next nightly build of

    implementation 'org.pytorch:pytorch_android_torchvision:1.8.0-SNAPSHOT'

will contain that fix.

@IvanKobzarev Thanks for the turnaround and fix! That does seem to be the change I implemented on my end to get it working. I’ll have to test this change out from the nightly snapshot some time.

I have another issue that I think I had fixed on my end as well. When the image gets particularly dark or bright, I see errant splotches of color. I think I see the same artifacts in your image on the pull request, around the shadows on the right leg and the chair rest.

On my end, I fixed this issue by wrapping the body of the clamp0255 macro in parentheses. E.g.:

#define clamp0255(x) (x > 255 ? 255 : x < 0 ? 0 : x)

I think in this context:

outData[wr++] = (clamp0255(ri) - normMeanRm255) / normStdRm255;

Because of how the macro expands and the low precedence of the ternary operator, I think that if ri clips below 0 or above 255, normMeanRm255 never actually gets subtracted from it.

Thanks a lot for pointing that out. The fix has landed in master: [android] fix yuv conversion - remove define by IvanKobzarev · Pull Request #50951 · pytorch/pytorch · GitHub