Referencing this: https://github.com/pytorch/audio/tree/main/examples/avsr
Has anyone gotten AVSR working with the torchaudio AVSR examples?
I have been trying for a few days to get an AV-HuBERT AVSR model working for inference, like the ones shown here: https://facebookresearch.github.io/av_hubert/
The goal is audio-visual speech recognition from a video stream and an audio stream. However, the example code provided in the PyTorch repository does not appear to have any support for running inference on these models.
Any guidance on getting a pre-trained AVSR model working would be great, as there does not appear to be much support for such a system.