I would like to build an Alexa-like system for some subset of questions. I have two questions about this: (1) what kind of problem does this task belong to, and (2) is this kind of task feasible for an ML beginner?
- What kind of problem is this?
i) My basic idea would be to first transform the speech to text; there are probably ready-made solutions for this.
ii) Analyze/tag the text from (i) to extract the question's keywords.
E.g. from the question "What is the name of the most visited website?"
=> filter “name most visited website”
iii) To answer the filtered question ("name most visited website"), I would run it against a database full of statements and return the statement with the highest match to the question.
Here, the database could possibly contain statements like:
E.g. “Pytorch(dot)org is the most visited website”.
→ Object: Pytorch(dot)org, Statement: most visited website
and then this statement would most likely match the question from above and then you would return the object of the statement (Pytorch(dot)org).
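Steps (ii) and (iii) above could be sketched very roughly like this. The stop-word list and the overlap scoring are illustrative assumptions, not a real NLP pipeline (a proper tagger such as spaCy or NLTK, or TF-IDF/embedding similarity, would replace both in practice):

```python
# Minimal sketch: filter a question down to content words, then return
# the stored statement with the largest keyword overlap.
# STOP_WORDS is a tiny hand-picked set just for this example.

STOP_WORDS = {"what", "is", "the", "of", "a", "an", "to", "in", "name"}

def filter_keywords(text):
    """Lowercase, strip basic punctuation, and drop stop words."""
    tokens = text.lower().replace("?", "").replace(".", "").split()
    return [t for t in tokens if t not in STOP_WORDS]

def best_match(question, statements):
    """Return the statement sharing the most keywords with the question."""
    q_words = set(filter_keywords(question))
    return max(statements, key=lambda s: len(q_words & set(filter_keywords(s))))

statements = [
    "Pytorch(dot)org is the most visited website",
    "Python is a popular programming language",
]
print(best_match("What is the name of the most visited website?", statements))
# prints: Pytorch(dot)org is the most visited website
```

A bare word-overlap score like this breaks down quickly (synonyms, paraphrases), which is where learned sentence representations would come in.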
I would like to build the statement database by parsing Wikipedia. For example, if the system is to answer questions about Machine Learning, I would first parse and save the Wikipedia article on Machine Learning, then follow all the links in that article, parse and save those pages in the same way, and continue until I have a certain number of articles.
I would then analyze/parse the crawled texts just like the questions above and add the resulting statements to a database.
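The crawl described above could look roughly like the sketch below, using only the standard library. The URL scheme and the link regex are simplifying assumptions; in practice the MediaWiki API or the `wikipedia` package would be more robust and easier on Wikipedia's servers:

```python
# Rough sketch: breadth-first crawl starting from one Wikipedia article,
# following in-article links until a target number of pages is saved.
import re
import urllib.request
from collections import deque

def extract_links(html):
    """Naive link extraction; skips special pages like 'File:' or 'Help:'."""
    return re.findall(r'href="/wiki/([^"#:]+)"', html)

def crawl_wikipedia(start_title, max_articles=10):
    base = "https://en.wikipedia.org/wiki/"
    queue = deque([start_title])
    seen, pages = set(), {}
    while queue and len(pages) < max_articles:
        title = queue.popleft()
        if title in seen:
            continue
        seen.add(title)
        try:
            with urllib.request.urlopen(base + title, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="ignore")
        except OSError:
            continue
        pages[title] = html  # in practice: strip HTML, keep article text
        queue.extend(extract_links(html))
    return pages
```

Turning the raw HTML into clean statements is the harder part; extracting (object, statement) pairs from free text is its own NLP problem (relation/information extraction).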
My questions here are whether this kind of approach makes sense, and what class of problems this task belongs to, so that I can read up on it.
I've done a few things so far with sentiment analysis (NLP) and image recognition. Do you think such a prototype project would be feasible with that knowledge within half a year? I currently only have an AMD R9 390: is Google Colab enough, or would I have to buy a new graphics card for this?