I want to use a scale pyramid to extract resize-invariant features for an image similarity project. I'm using a ResNet18 trained with triplet loss, and I want to extract features at multiple scales and average-pool them.
But when I pass the image tensor through kornia.geometry.transform.ScalePyramid, it returns images with shape (B, C, NL, H, W).
I'm not sure how to interpret the NL dimension. I need tensors of shape (B, C, H, W). How can I drop this NL dimension?
I also want to train a network that is invariant to scale. Can I use this scale pyramid during training? If so, how do I accommodate the NL dimension?
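Roughly what I'm running (a minimal sketch; image size and ScalePyramid arguments are just placeholders, and the ResNet18 itself is omitted):

```python
import torch
import kornia

x = torch.rand(8, 3, 224, 224)                 # batch of images, (B, C, H, W)
sp = kornia.geometry.transform.ScalePyramid()  # default settings
octaves, sigmas, pixel_dists = sp(x)
print(octaves[0].shape)  # (B, C, NL, H, W): this NL dimension is what confuses me
# my ResNet18 expects plain (B, C, H, W) tensors, so I can't feed octaves[0] directly
```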
NL is the number of scale levels of the scale space; see the SIFT paper, Figure 1, on the left. It means the image is blurred progressively more and more, which for linear filters is equivalent to resizing.
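If you do stick with ScalePyramid, one way to deal with NL is a rough sketch like the one below: it assumes each octave tensor really has the (B, C, NL, H, W) shape you describe, and that backbone is your ResNet18 mapping (N, C, H, W) images to (N, D) embeddings. The idea is to fold the levels into the batch dimension, run the network once per octave, and average the embeddings.

```python
import torch
import kornia

def scalespace_embed(x, backbone):
    # Hypothetical helper: average embeddings over a kornia scale-space pyramid.
    # Assumes each octave tensor is (B, C, NL, H, W).
    sp = kornia.geometry.transform.ScalePyramid()
    octaves, sigmas, pixel_dists = sp(x)
    per_octave = []
    for octave in octaves:
        b, c, nl, h, w = octave.shape
        # fold NL into the batch so the backbone sees ordinary (N, C, H, W) tensors
        levels = octave.permute(0, 2, 1, 3, 4).reshape(b * nl, c, h, w)
        emb = backbone(levels).view(b, nl, -1)          # (B, NL, D)
        per_octave.append(emb.mean(dim=1))              # average over levels
    return torch.stack(per_octave, dim=0).mean(dim=0)   # (B, D), averaged over octaves
```

Everything here is plain differentiable tensor ops, so the same pooling should be usable during training with your triplet loss as well.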
If you just need a pyramid that contains H x W, H/2 x W/2, H/4 x W/4 and so on, use the build_pyramid function from kornia.geometry.transform instead. It does not have the levels dimension, unlike the scale-space pyramid; the mention of it in the documentation is a mistake (fixed in master).
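For example, a minimal sketch (again assuming backbone is your triplet-trained ResNet18 returning (N, D) embeddings; that part is your code, not kornia's):

```python
import torch
import kornia

def pyramid_embed(x, backbone, max_level=4):
    # Gaussian pyramid: full size, H/2 x W/2, H/4 x W/4, ...
    pyr = kornia.geometry.transform.build_pyramid(x, max_level=max_level)
    feats = [backbone(level) for level in pyr]      # each: (B, D)
    return torch.stack(feats, dim=0).mean(dim=0)    # (B, D), averaged over scales
```

Since torchvision's ResNet18 ends in global average pooling, the different spatial sizes per level are fine as long as the smallest level is not too small.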