I don’t know if I need NLP for a problem I have with a large audio dataset. Let me explain why.
I only need to find all relationships between a specific outcome and the 20 hours of audio data that influence it. The words, in English, Spanish, and French, DO matter, and they DO influence the results.
My running costs will get smaller and smaller, as I will be constructing my own data lab over the next 5 years. Maximum accuracy is my top concern, and I want to learn as much about the design and the ideas behind it as possible.
Would running a find-all-relationships method asynchronously on a massive data center be the right answer, instead of transcribing the audio into words?
I am a novice at this. I hope I’ll be working with you all on here a lot for a while.
I want to make sure I start the design the right way so I can expand the training module with more and more data, but I want ALL relationships, even ones I can’t understand, like the tone and inflection of the voice. (Maybe cut it to mono and a low bit rate to run it quicker?)
Basically, would I not need to train the model to “understand” the language, only that this data produced this result, in order to find the maximum number of relationships possible?
I look forward to chatting a lot! Thanks all in advance.
For binary problems, you just put 1 for the out_features of the final layer, then run it through an nn.Sigmoid() activation layer. Finally, run that and your target through nn.BCELoss() as your loss function.
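That recipe can be sketched as a toy PyTorch model. The layer sizes and the random feature batch here are hypothetical, just to show the shapes; only the final `out_features=1`, the `nn.Sigmoid()`, and the `nn.BCELoss()` are the parts being described above:

```python
import torch
import torch.nn as nn

# Toy binary classifier head: features go in, a single logit comes out,
# Sigmoid maps that logit to a probability in (0, 1).
model = nn.Sequential(
    nn.Linear(16, 8),   # hypothetical feature size of 16
    nn.ReLU(),
    nn.Linear(8, 1),    # out_features = 1 for a binary problem
    nn.Sigmoid(),       # squashes the logit into (0, 1)
)

loss_fn = nn.BCELoss()

features = torch.randn(4, 16)                     # batch of 4 fake samples
targets = torch.tensor([[1.], [0.], [1.], [0.]])  # binary labels as floats

probs = model(features)        # shape (4, 1), values in (0, 1)
loss = loss_fn(probs, targets)
print(loss.item())             # a single scalar loss value
```

During training you would call `loss.backward()` and step an optimizer as usual; nothing about the binary setup changes that part.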
As far as model structure before that, and whether or not to use a pre-trained NLP model, you will likely need to experiment with your data and various setups to determine the most suitable method for finding a best fit.
As an aside, you will almost always (always, as far as I know) be better off using BCEWithLogitsLoss without the Sigmoid (instead of BCELoss). The two approaches are mathematically equivalent, but BCEWithLogitsLoss is better behaved numerically (the core problem being that Sigmoid can easily saturate, at which point the logarithm inside BCELoss blows up).
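The saturation problem can be demonstrated in a few lines of plain Python. The `bce_with_logits` function below is my own sketch of the log-sum-exp form that BCEWithLogitsLoss uses internally; the exact logit values are just illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_naive(logit, target):
    # BCELoss applied after an explicit sigmoid
    p = sigmoid(logit)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def bce_with_logits(logit, target):
    # Numerically stable log-sum-exp form:
    # max(x, 0) - x*t + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

# For moderate logits the two agree to machine precision
print(bce_naive(2.0, 1.0), bce_with_logits(2.0, 1.0))

# A confidently wrong prediction: sigmoid(40) rounds to exactly 1.0
# in float64, so the naive loss tries to compute log(1 - 1.0) ...
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive version failed:", err)

# ... while the logits form stays finite (roughly 40.0 here)
print(bce_with_logits(40.0, 0.0))
```

The same divergence happens inside PyTorch tensors, except there you get `inf`/`nan` losses instead of an exception, which is harder to notice.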
I will try exploring both after data prep! I have been prepping my data right now, and the examples I’ve seen seem to encode the label of each sample in the file name. If I want to keep the most AI methods open to me, do I put the samples in separate folders, or simply name the files true.mp3 or false.mp3?
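Either convention can work, but a folder per class is the more common layout, because it lets the file names stay descriptive (you can’t have two files both named true.mp3 in one folder anyway). A minimal sketch, assuming hypothetical `true/` and `false/` folder names, that turns such a layout into `(path, label)` pairs:

```python
from pathlib import Path
import tempfile

def collect_labeled_files(root):
    """Scan root/true and root/false for .mp3 files and return
    a list of (path, label) pairs, with 1 = true and 0 = false."""
    pairs = []
    for folder_name, label in (("true", 1), ("false", 0)):
        for path in sorted((Path(root) / folder_name).glob("*.mp3")):
            pairs.append((path, label))
    return pairs

# Build a throwaway example layout just to show the output shape
root = Path(tempfile.mkdtemp())
for name in ("true/a.mp3", "true/b.mp3", "false/c.mp3"):
    p = root / name
    p.parent.mkdir(exist_ok=True)
    p.touch()

for path, label in collect_labeled_files(root):
    print(path.name, label)
```

A list like this is exactly what a custom PyTorch `Dataset`’s `__getitem__` would index into: load the audio at `path`, return it alongside `label`.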
Also, does each sample need to be the same length, quality, and bit rate so as not to confuse the model? I would guess so, but I have no technical background in this. Could I save myself a few months of prep time by using mismatched data (still all labeled true or false, but of different quality, length, etc.)? It’s okay if the system can’t handle that; I’m just trying to make sure I’m not wasting time making them all the same.
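You generally don’t need to re-master the files by hand: a common approach is to standardize on the fly in the data-loading code, resampling everything to one rate (e.g. with torchaudio or librosa) and then padding or trimming to one length. A toy sketch of just the length step, on plain lists of samples so the idea is visible without any audio library:

```python
def pad_or_trim(samples, target_len, pad_value=0.0):
    """Force a 1-D list of audio samples to a fixed length:
    trim if too long, zero-pad at the end if too short."""
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [pad_value] * (target_len - len(samples))

clips = [
    [0.1, 0.2, 0.3, 0.4, 0.5],   # too long
    [0.7, 0.8],                  # too short
]
batch = [pad_or_trim(clip, 4) for clip in clips]
print(batch)   # [[0.1, 0.2, 0.3, 0.4], [0.7, 0.8, 0.0, 0.0]]
```

So mismatched source files are fine to keep; what matters is that every tensor reaching the model has the same shape and sample rate after this normalization step.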