This is a personal assistant project for kids, similar to Alexa and Google Home. The main aim of the project is to answer the why, when, where and how questions that kids ask. It also helps with language skills: it can tell the meaning of a word and spell a word too.
Hardware:
- Raspberry Pi or a similar IoT board that supports Windows 10 IoT Core
- Good microphone (check that it is compatible with Windows 10 IoT)
- Good speakers
Software:
- UWP app
- Google Cloud Speech
LUIS:
1) Create a LUIS model to recognize wiki entities and regions.
2) Create custom entities to identify what the user is asking about.
Example: tell me about Donald Trump
In the above example, LUIS will recognize the intent as "educational" and the entity as "Donald Trump".
3) Train LUIS with some sample utterances.
4) From the code, send the question to LUIS; it replies with JSON that includes the intent and the entities. If the intent is our custom educational intent, grab the entities and send them to the wiki API to get a summary.
5) If a usable summary comes back, read it out to the user using Microsoft speech synthesis.
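Steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration in Python rather than the app's actual UWP code. It assumes the LUIS reply uses the v2-style fields `topScoringIntent` and `entities`, and that the "wiki API" is the Wikipedia REST summary endpoint; the `min_score` threshold and both function names are hypothetical.

```python
import json
from urllib.parse import quote

EDUCATIONAL_INTENT = "educational"  # intent name from the example above


def extract_topic(luis_json, min_score=0.5):
    """Return the first entity if the top intent is 'educational', else None.

    Assumes a LUIS v2-style reply; min_score is an illustrative cut-off.
    """
    reply = json.loads(luis_json)
    top = reply.get("topScoringIntent", {})
    if top.get("intent") != EDUCATIONAL_INTENT:
        return None
    if top.get("score", 0.0) < min_score:
        return None
    entities = reply.get("entities", [])
    return entities[0]["entity"] if entities else None


def wiki_summary_url(topic):
    """Build a summary URL for the topic (assumed Wikipedia REST endpoint)."""
    title = quote(topic.strip().replace(" ", "_"), safe="")
    return "https://en.wikipedia.org/api/rest_v1/page/summary/" + title


sample_reply = json.dumps({
    "query": "tell me about Donald Trump",
    "topScoringIntent": {"intent": "educational", "score": 0.97},
    "entities": [{"entity": "Donald Trump", "type": "wiki"}],
})
topic = extract_topic(sample_reply)
if topic is not None:
    print(wiki_summary_url(topic))
```

When `extract_topic` returns None, the query is not an educational one and can be passed on to the next service instead.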
API.AI:
1) This is very similar to LUIS. Create a new app and a custom model to identify the entities.
2) I created a custom model and imported all the dictionary words into it.
3) I created two new intents, "word_meaning" and "spell_word".
Example: what is the meaning of analysis
In the above example, the intent is "word_meaning" and the entity is "analysis".
4) From the code, after the LUIS check, send the query to this API; it returns the intent and the entity. Once you get the desired intent, send the entity to the Oxford dictionary API to get the meaning of the word.
5) If all is good, read out the meaning.
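The dispatch on the two intents might look something like the sketch below, written in Python for illustration only. It assumes the API.AI reply carries the intent name under `result.metadata.intentName` and the matched word under a parameter named `word` (the parameter name is hypothetical). The dictionary lookup is passed in as a function so that no Oxford API call is invented here.

```python
def spell_out(word):
    """Spell a word letter by letter, e.g. for text-to-speech."""
    return " ".join(ch.upper() for ch in word if ch.isalpha())


def handle_apiai_reply(reply, lookup_meaning):
    """Return the text to read out, or None if the intent is not ours.

    reply          -- parsed JSON reply (assumed API.AI v1-style shape)
    lookup_meaning -- hypothetical callable standing in for the
                      Oxford dictionary API request
    """
    result = reply.get("result", {})
    intent = result.get("metadata", {}).get("intentName")
    word = result.get("parameters", {}).get("word")  # assumed parameter name
    if not word:
        return None
    if intent == "word_meaning":
        return lookup_meaning(word)
    if intent == "spell_word":
        return spell_out(word)
    return None


# Stand-in dictionary used only for this example.
fake_dictionary = {"analysis": "detailed examination of something"}

sample = {
    "result": {
        "metadata": {"intentName": "word_meaning"},
        "parameters": {"word": "analysis"},
    }
}
print(handle_apiai_reply(sample, fake_dictionary.get))
```

Spelling needs no external service, so only the "word_meaning" branch depends on the dictionary API.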
Microsoft speech recognizer:
I used this component for continuous listening. We need to add a custom word to its dictionary; this is the app's wake-up word. If the recognized word matches, record the rest of the speech using MediaCapture, which saves the audio to an in-memory stream.
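The wake-word check itself is simple to sketch. The Python snippet below is an illustration, not the UWP recognizer code: it compares the recognized phrase with the wake-up word while ignoring case and punctuation. The wake-up phrase "hey buddy" and the function names are made up for the example.

```python
import re


def _tokens(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def heard_wake_word(recognized, wake_word):
    """True if the wake-up word appears as a whole phrase in the recognized text."""
    heard, wake = _tokens(recognized), _tokens(wake_word)
    if not wake:
        return False
    return any(
        heard[i:i + len(wake)] == wake
        for i in range(len(heard) - len(wake) + 1)
    )


WAKE_WORD = "hey buddy"  # hypothetical wake-up phrase

for phrase in ["Hey, Buddy!", "hey there"]:
    if heard_wake_word(phrase, WAKE_WORD):
        print("start recording after:", phrase)
```

Matching on whole tokens rather than substrings avoids waking up on words that merely contain the wake-up word.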
Google Cloud Speech:
Once listening has completed, send the audio to Google in FLAC format. At the moment, UWP does not support the FLAC codec, so I created a custom API that converts WAV to FLAC, sends it to the Google Speech API, and returns the text to the UWP app.
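As a rough sketch of what the custom API does after the conversion, the Python snippet below builds a request body for the Google Speech REST `speech:recognize` method and pulls the text out of a reply. It assumes the audio has already been converted to FLAC. The 16000 Hz sample rate, the "en-US" language code and the function names are assumptions for illustration; the HTTP call itself is left out.

```python
import base64


def build_recognize_request(flac_bytes, sample_rate_hz=16000, language_code="en-US"):
    """Build the JSON body for a synchronous recognize request.

    The audio is sent inline as base64; rate and language are assumed defaults.
    """
    return {
        "config": {
            "encoding": "FLAC",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(flac_bytes).decode("ascii")},
    }


def first_transcript(response):
    """Return the top transcript from a recognize reply, or None if empty."""
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            return alternatives[0].get("transcript")
    return None


body = build_recognize_request(b"fLaC")  # placeholder bytes, not real audio
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "tell me about donald trump", "confidence": 0.92}]}
    ]
}
print(first_transcript(sample_response))
```

The returned text is what goes back to the UWP app, where it is passed to LUIS.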
Flow chart: