How to Apply a Hamming/Hanning Window?

Hello everyone,

I’m currently working on a deep neural network that tries to locate musical onsets in audio samples. It returns a 1D tensor containing the probability of an onset at each timestamp, from 0 to 1. It’s working pretty well, but sometimes it returns a double onset where there should only be one.
[Image: plot of the model output. The double peak on the left should be a single peak.]

While additional training remedies the problem somewhat, it seems to be a natural consequence of the model I’m using. To fix this, I believe that I need to apply a Hamming/Hanning window across the outputs to smooth out these double peaks, as was done in this paper: Dance Dance Convolution (arxiv.org).

How would I do this? I have read the PyTorch docs on the hamming window function (torch.hamming_window — PyTorch 1.13 documentation), but for some reason the explanation there isn’t very enlightening for me. Any help is much appreciated.

Thank you for your time,

-BanBot2

I haven’t read the details of this paper, but something like this might work:

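import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# toy signal: two nearby impulses standing in for a double onset
x = torch.zeros(1024)
x[512] = 1.
x[525] = 1.
# add noise
x = x + torch.randn(1024) * 0.01
plt.plot(x)

# smooth by convolving with a Hamming window normalized to unit sum
filt = torch.hamming_window(32)
out = F.conv1d(x[None, :], filt[None, None, :], padding=filt.size(0)//2) / filt.sum()
# rescale so the tallest peak is 1 (an even window with this padding yields 1025 samples, not 1024)
plt.plot(out.squeeze(0) / out.max())
plt.show()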

Output:
[image: plot of the noisy input and the smoothed output]
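
To check that the smoothing really merges a double peak, here is a minimal sketch along the same lines. It is not taken from the paper: `smooth_onsets` and `pick_peaks` are hypothetical helper names, the window length of 33 and the 0.5 threshold are assumed values you would tune on your own data, and the peak picker is a plain local-maximum test. An odd-length symmetric window (`periodic=False`) is used so that `padding = window_length // 2` keeps the output the same length as the input.

```python
import torch
import torch.nn.functional as F


def smooth_onsets(probs: torch.Tensor, window_length: int = 33) -> torch.Tensor:
    """Smooth a 1D tensor by convolving it with a unit-sum Hamming window."""
    # Symmetric window; odd length so the output has the same length as the input.
    window = torch.hamming_window(window_length, periodic=False, dtype=probs.dtype)
    kernel = (window / window.sum())[None, None, :]  # shape (out_ch, in_ch, width)
    smoothed = F.conv1d(probs[None, None, :], kernel, padding=window_length // 2)
    return smoothed[0, 0]


def pick_peaks(signal: torch.Tensor, threshold: float) -> torch.Tensor:
    """Return indices of local maxima that exceed `threshold` (hypothetical helper)."""
    inner = signal[1:-1]
    rising = inner > signal[:-2]    # strictly higher than the left neighbour
    falling = inner >= signal[2:]   # at least as high as the right neighbour
    return torch.nonzero(rising & falling & (inner > threshold)).flatten() + 1


# Toy double onset: two impulses 13 samples apart (noise left out for clarity).
x = torch.zeros(1024)
x[512] = 1.0
x[525] = 1.0

smoothed = smooth_onsets(x)
smoothed = smoothed / smoothed.max()  # rescale so the tallest peak is 1

raw_peaks = pick_peaks(x, threshold=0.5)
smoothed_peaks = pick_peaks(smoothed, threshold=0.5)
print(raw_peaks.tolist(), smoothed_peaks.tolist())
```

Note the trade-off: a wider window merges peaks that are further apart, but it will also merge genuine onsets that fall closer together than roughly the window width, so the length is worth choosing relative to the shortest onset spacing you expect.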

That looks like just the ticket! Thanks for the help. :smile: