What is NFCC mentioned in the tutorial?

Continuing my journey through the PyTorch tutorials, I’m now at the “AUDIO FEATURE EXTRACTIONS” page.

At the bottom of the page, there’s a section titled “Kaldi Pitch (beta)”. The code accompanying this section shows how to generate two features, “pitch” and “nfcc”. This was my first time seeing the term “nfcc”, so I searched for it online and could not find anything. However, the paper linked in the same section uses a different acronym, NCCF.

I wonder, is this a typo in the tutorial? Or is there actually an acronym “NFCC”? If it’s the latter, what does “NFCC” stand for?

One thing that makes me think this is not a typo, and that the value calculated in the tutorial is not NCCF, is that NCCF stands for Normalized Cross Correlation Function. As a result, it is by nature a value between -1 and 1. But the values in the tensor calculated in the tutorial are far outside that range (125 to 250).
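To make the range argument concrete, here is a minimal sketch of a plain normalized cross-correlation between a window of a signal and a lagged copy of it. This is the textbook definition, not necessarily the exact variant used by Kaldi or torchaudio, and the function name `nccf`, the test tone, and the window and lag values are all made up for illustration. By the Cauchy-Schwarz inequality the result cannot leave [-1, 1], whereas a pitch estimate in Hz derived from the best lag can easily be in the hundreds.

```python
import math

def nccf(signal, start, window, lag):
    """Plain normalized cross-correlation of a window with its lagged copy."""
    a = signal[start : start + window]
    b = signal[start + lag : start + lag + window]
    if len(a) != window or len(b) != window:
        raise ValueError("window plus lag runs past the end of the signal")
    dot = sum(x * y for x, y in zip(a, b))
    # Product of the two window norms; dividing by it bounds the result.
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / energy if energy > 0 else 0.0

# Illustrative input: a 100 Hz sine sampled at 8 kHz (period of 80 samples).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]

lags = range(20, 121)
values = [nccf(signal, 0, 240, lag) for lag in lags]

# The lag with the highest correlation gives a pitch estimate in Hz.
best_lag = max(lags, key=lambda lag: nccf(signal, 0, 240, lag))
pitch_hz = sample_rate / best_lag

print(min(values), max(values))  # correlation values, bounded by -1 and 1
print(best_lag, pitch_hz)        # lag in samples and pitch in Hz
```

So a column holding values like 125 to 250 looks much more like a frequency in Hz than like a correlation value.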

I’d appreciate it if someone could help me understand what this NFCC is and how I can learn more about it. Thanks.

I think it is a typo, as the docs show that torchaudio.functional.compute_kaldi_pitch returns:

Pitch feature. Shape: (batch, frames, 2) where the last dimension corresponds to pitch and NCCF.

However, I don’t know why the values are apparently not normalized to [-1, 1].
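One way to narrow this down would be to look at the value range of each slot of the last dimension separately and see which one, if any, stays within [-1, 1]. Below is a rough sketch in plain Python. The helper names `channel_ranges` and `could_be_nccf` are hypothetical, and the `frames` values are made-up stand-ins rather than real output, so their ordering says nothing about what torchaudio actually returns. With a real tensor you would index each slot with `feature[..., 0]` and `feature[..., 1]` instead.

```python
def channel_ranges(frames):
    """Return (min, max) for each channel of a list of equal-length frames."""
    channels = list(zip(*frames))
    return [(min(c), max(c)) for c in channels]

def could_be_nccf(lo, hi, tol=1e-6):
    """True if the whole range fits inside [-1, 1], as a correlation must."""
    return -1.0 - tol <= lo and hi <= 1.0 + tol

# Made-up stand-in for one utterance of shape (frames, 2); NOT real output.
frames = [(0.62, 131.0), (0.91, 187.5), (-0.15, 244.0), (0.88, 203.2)]

ranges = channel_ranges(frames)
flags = [could_be_nccf(lo, hi) for lo, hi in ranges]

for index, ((lo, hi), flag) in enumerate(zip(ranges, flags)):
    print(f"channel {index}: min={lo}, max={hi}, fits [-1, 1]: {flag}")
```

If only one slot fits inside [-1, 1], that is the likely NCCF candidate, and the other one is presumably the pitch in Hz.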
