Audio data, True/False Large Data Set; Best Methods?

paw · January 17, 2023, 6:16am

hello all.
I need to build an ML neural network that gives a boolean answer for large amounts of data.

Each 20-hour block has only one answer: True or False.

I will have access to over 6,480 blocks of 20 hours each, labeled True or False (I'm currently prepping that data).

If I run it like the yes/no examples I found, will the model pick up all the possible relationships between the True data and the False data, or will I need to add more plug-ins/functions to find those relationships?

Below is the type of example I'm starting with, but please let me know if a different idea/method would be better to build on.

thanks so much all, look forward to going on this journey all together.

shivammehta007 · February 3, 2023, 8:11am

This looks like a good starting point, but you may want to play around with some sort of downsampled representation of the audio signal. As far as I understand, you have a 20-hour audio recording and a single binary label for it, which makes this a very computation-heavy task. You could try pretrained representations of the audio, like wav2vec2 vectors or even Whisper's embeddings (depending on your problem), and then apply some sort of pooling to them to reduce the long waveform to something shorter that you can compute and train with.
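To make the pooling idea concrete, here is a minimal, dependency-free sketch. It splits one block into fixed-length frames, turns each frame into a small feature vector, and then mean- and max-pools over time so the whole block becomes one short vector. Everything named here is an assumption for illustration: `frame_features` is a hypothetical stand-in for a real pretrained encoder (wav2vec2, Whisper), and `pool_block`, the 16 kHz sample rate, and the frame length are made up for the example.

```python
import math
import random


def frame_features(frame):
    """Hypothetical stand-in for a pretrained encoder.

    Returns [log-energy, zero-crossing rate] for one frame of samples.
    A real setup would return an embedding vector from a model instead.
    """
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [math.log(energy + 1e-10), crossings / (len(frame) - 1)]


def pool_block(samples, frame_len):
    """Reduce a long waveform to one fixed-size vector.

    Splits `samples` into non-overlapping frames (a trailing partial
    frame is dropped), computes features per frame, then concatenates
    the mean and the max of each feature over all frames.
    """
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        raise ValueError("block is shorter than one frame")
    feats = [
        frame_features(samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]
    dim = len(feats[0])
    mean = [sum(f[d] for f in feats) / n_frames for d in range(dim)]
    peak = [max(f[d] for f in feats) for d in range(dim)]
    return mean + peak


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 16000  # assumed; use whatever your recordings have
    # Two seconds of noise as a toy stand-in for a 20-hour block.
    block = [rng.uniform(-1.0, 1.0) for _ in range(sample_rate * 2)]
    vec = pool_block(block, frame_len=sample_rate // 2)
    print(len(vec))  # one short vector per block, whatever its length
```

The pooled vector has the same size no matter how long the block is, so each 20-hour block becomes one row you can feed to a small binary classifier together with its True/False label. Plain mean/max pooling throws away the order of events in time, so if the label depends on when something happens you would likely want a smarter pooling step.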