How to normalize uint16 depth image for training?

The depth images rendered from the ScanNet dataset are stored as uint16. After dividing the depth by the shift (1000), the values fall in a range [0, d_max], where d_max is some positive float below 10. How do I normalize the depth to [0, 1] (per dataset) for training?
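For reference, this is roughly how the depth is converted (a sketch assuming the depth frame is a 16-bit PNG read with OpenCV; the path is hypothetical):

```python
import cv2
import numpy as np

# Hypothetical path; ScanNet-style depth frames are 16-bit PNGs storing millimeters.
depth_raw = cv2.imread("scene0000_00/depth/000000.png", cv2.IMREAD_UNCHANGED)
depth_m = depth_raw.astype(np.float32) / 1000.0  # divide by the shift -> meters
print(depth_m.min(), depth_m.max())
```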

So the question is whether you want per-dataset or per-image normalization. I would imagine that you want per-dataset (global) normalization because depth is a physical quantity (though in medical imaging it is not uncommon to use per-image normalization, in particular when working with images taken by different scanners). If you need the global maximum, you'd need to iterate over the dataset once during preprocessing and find it.
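In code, that preprocessing pass could look roughly like this (a sketch; the glob pattern and the shift of 1000 are assumptions about how the depth files are stored):

```python
import glob

import cv2
import numpy as np

# One pass over the training depth maps to find the per-dataset maximum (in meters).
max_depth = 0.0
for path in glob.glob("train/*/depth/*.png"):  # hypothetical directory layout
    depth = cv2.imread(path, cv2.IMREAD_UNCHANGED).astype(np.float32) / 1000.0
    max_depth = max(max_depth, float(depth.max()))

print("max depth over the training set:", max_depth)
```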

Best regards

Thomas

I need the per-dataset normalization. By "iterate over the dataset and find the maximum", do you mean over the training set only, or also including the test set?

The rules are that you should use the training set.
The next question then is what to do when the test set has a larger depth somewhere. I'd probably just clamp it down to 1. With a good training set, I would expect it to max out the depth reported by the camera somewhere.
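As a rough sketch of the clamping, assuming max_train_depth is the maximum found in the preprocessing pass:

```python
import numpy as np

def clamp_normalized_depth(depth_m, max_train_depth):
    """Scale by the training-set maximum and clamp anything beyond it to 1."""
    return np.clip(depth_m / max_train_depth, 0.0, 1.0)
```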

Best regards

Thomas

Thanks, Tom. So then I can normalize the depth using (depth - min_depth) / (max_depth - min_depth), where max_depth is the maximum depth value over the whole training set, right?

That's what I'd do, though you might just keep min_depth at 0.
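Putting it together, a minimal sketch of such a transform (min_depth kept at 0 and out-of-range values clamped to [0, 1] as discussed above; the class name, the example max_depth, and the tensor conversion at the end are just illustrative choices):

```python
import numpy as np
import torch


class DepthNormalize:
    """Min-max normalize a depth map (in meters) using training-set statistics."""

    def __init__(self, max_depth, min_depth=0.0):
        self.min_depth = min_depth
        self.max_depth = max_depth

    def __call__(self, depth_m):
        d = (depth_m - self.min_depth) / (self.max_depth - self.min_depth)
        d = np.clip(d, 0.0, 1.0)  # clamp test-time values beyond the training max
        return torch.from_numpy(d.astype(np.float32)).unsqueeze(0)  # 1 x H x W


# normalize = DepthNormalize(max_depth=9.3)  # 9.3 is a made-up training-set maximum
```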

I have a question regarding normalising depth images. How did you find the maximum depth value over the whole training set?