The documentation says that the input is a “Gaussian 5d input” with dimensions [B,C,D,H,W]. It is not clear to me how to get this representation and what it means exactly.
I’m trying to extract patches to use with HardNet.
Based on reading the SIFT paper I think I need to combine the following functions:
- ScalePyramid() to get the different octaves/levels
- dog_response - which subtracts neighbouring levels in each octave
- Non-maximum suppression - to get the feature points
- extract_patches_from_pyramid() - to get the local patches to be fed to HardNet
I’m not sure how to combine these functions, since their inputs and outputs don’t seem to match up in a straightforward way.
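To make the question concrete, here is how I currently picture the steps listed above, as a minimal sketch in plain NumPy rather than kornia. It assumes that D in [B,C,D,H,W] is the number of blur levels inside one octave; that is my guess, not something the documentation confirms. All function names, the sigma spacing and the threshold are hypothetical and only for illustration. It handles a single octave and stops before patch extraction.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2D array (kernel truncated at 3 sigma)."""
    radius = max(1, int(round(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    blur_1d = lambda v: np.convolve(v, kernel, mode="valid")
    rows = np.apply_along_axis(blur_1d, 1, padded)  # blur along W
    return np.apply_along_axis(blur_1d, 0, rows)    # blur along H


def gaussian_octave(img, n_levels=5, sigma0=1.6):
    """Blur one 2D image at n_levels scales and stack as [B, C, D, H, W]."""
    # Assumed sigma spacing; a real scale pyramid may use a different one.
    sigmas = [sigma0 * 2.0 ** (i / 3.0) for i in range(n_levels)]
    levels = np.stack([gaussian_blur(img, s) for s in sigmas])  # [D, H, W]
    return levels[None, None]  # prepend B=1 and C=1


def difference_of_gaussians(octave):
    """Subtract neighbouring levels along D: D levels in, D-1 out."""
    return octave[:, :, 1:] - octave[:, :, :-1]


def local_maxima_3d(volume, threshold):
    """Return (level, y, x) indices that are maxima of their 3x3x3 neighbourhood."""
    d, h, w = volume.shape
    padded = np.pad(volume, 1, mode="constant", constant_values=-np.inf)
    is_max = volume > threshold
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dz == 0 and dy == 0 and dx == 0:
                    continue
                neighbour = padded[1 + dz:1 + dz + d,
                                   1 + dy:1 + dy + h,
                                   1 + dx:1 + dx + w]
                is_max &= volume >= neighbour
    return np.argwhere(is_max)


# Synthetic test image: one bright blob centred at (y=16, x=16).
yy, xx = np.mgrid[0:32, 0:32]
image = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 3.0 ** 2))

octave = gaussian_octave(image)             # shape (1, 1, 5, 32, 32)
response = difference_of_gaussians(octave)  # shape (1, 1, 4, 32, 32)
# Take the absolute value so that both bright and dark blobs give peaks.
keypoints = local_maxima_3d(np.abs(response[0, 0]), threshold=0.01)
print(octave.shape, response.shape, keypoints.shape)
```

Each row of `keypoints` is (level, y, x) within this one octave. What I can’t work out is how to get from indices like these to whatever keypoint format extract_patches_from_pyramid() expects.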
An example demonstrating this use case would be much appreciated.
I know about this example, which shows how to extract the patches around keypoints: https://github.com/kornia/kornia/blob/master/examples/feature_detection/extract_local_patches.ipynb