Hi All,
I appreciate your help.
Looking at the source code of torchaudio.models.conformer (torchaudio.models.conformer — Torchaudio 2.4.0 documentation), I don't see the relative positional encoding that the multi-head self-attention module should include according to the paper. I also couldn't find it in the source code of torch.nn.modules.activation.MultiheadAttention (torch.nn.modules.activation — PyTorch 2.4 documentation), which is what the Conformer layer uses for self-attention.
Should I assume I'm expected to implement the relative positional encoding myself?
That seems a bit awkward, since according to the paper (https://arxiv.org/pdf/2005.08100) the relative positional encoding is applied inside the MHSA module, after the LayerNorm, so it isn't something I can simply add to the inputs from outside the module.
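For reference, here is a rough sketch of the kind of module I imagine I would have to write myself and swap in for torch.nn.MultiheadAttention inside the Conformer layer. It uses learned relative position embeddings in the spirit of Shaw et al. rather than the exact Transformer-XL scheme from the paper, and the class name and `max_len` parameter are just placeholders I made up, so please correct me if this is not the intended approach:

```python
import torch
import torch.nn as nn

class RelPositionMultiheadSelfAttention(nn.Module):
    """Sketch only: self-attention with learned relative position embeddings.
    This is NOT torchaudio's implementation, just what I think is missing."""

    def __init__(self, embed_dim: int, num_heads: int, max_len: int = 512):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.max_len = max_len
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.out = nn.Linear(embed_dim, embed_dim)
        # One learned embedding per relative offset in [-(max_len-1), max_len-1].
        self.rel_pos = nn.Embedding(2 * max_len - 1, self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, embed_dim)
        b, t, _ = x.shape
        assert t <= self.max_len, "sequence longer than the relative-position table"
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        # reshape each to (batch, heads, time, head_dim)
        def split(z):
            return z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)

        # content-content scores: (batch, heads, time, time)
        content = torch.matmul(q, k.transpose(-2, -1))

        # relative offsets j - i, shifted so indices are non-negative
        pos = torch.arange(t, device=x.device)
        rel = pos[None, :] - pos[:, None] + self.max_len - 1   # (time, time)
        r = self.rel_pos(rel)                                   # (time, time, head_dim)

        # content-position scores: q_i . r_{ij}, per head
        position = torch.einsum("bhid,ijd->bhij", q, r)

        attn = torch.softmax((content + position) / self.head_dim ** 0.5, dim=-1)
        out = torch.matmul(attn, v)                              # (batch, heads, time, head_dim)
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.out(out)

# quick shape check
mha = RelPositionMultiheadSelfAttention(embed_dim=144, num_heads=4)
y = mha(torch.randn(2, 100, 144))   # -> (2, 100, 144)
```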
Thanks.