Tech Reflect Voice Log

From ESE205 Wiki
Revision as of 19:25, 4 February 2018 by Ethanshry (talk | contribs) (Feb.4 Ethan Edits)

Jan. 22 -> Jan. 28 2018

Ethan

  • Tested and researched speech-to-text (STT) engines
    • Cannot test Snowboy hotword detection until we have a Raspberry Pi and a hardware microphone, since it only runs on Linux
    • The Google voice API looks like the best option for STT; most others struggle significantly with numbers and names
  • IBM has a good text-to-speech (TTS) engine, which we will probably use for our TTS needs

Tony

  • Looked into the cost range of an appropriately sized one-way mirror (18"-24" x 24"-36")
  • Looked into cheaper monitors that could cover at least 3/4 of the mirror's surface
  • Did some minor research into possible designs for the exterior frame of the mirror
  • Current findings: The mirror will most likely be 24" x 36" unless a significantly cheaper option (18" x 24") appears in the immediate future. The 24" x 36" mirror currently costs $50. I found some relatively inexpensive monitors for around $70 with a diagonal of 30-36", enough to cover the vast majority of the mirror's interior. The surrounding frame will probably be flat to minimize complications during 3D printing. Some sort of RGB LED, either strips or individual LEDs at set locations, will be mounted on the frame (inside or outside the exterior frame) to indicate whether the mirror is listening, processing, or executing a voice command.
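
The LED indicator idea above could be modeled as a small state-to-color lookup. This is only a sketch: the `MirrorState` names, the extra idle state, and the specific colors are assumptions for illustration, and nothing here drives real LED hardware.

```python
from enum import Enum


class MirrorState(Enum):
    """States the frame LEDs would signal (IDLE is an added assumption)."""
    IDLE = "idle"
    LISTENING = "listening"
    PROCESSING = "processing"
    EXECUTING = "executing"


# Hypothetical color choices as (R, G, B) tuples, each channel 0-255.
STATE_COLORS = {
    MirrorState.IDLE: (0, 0, 0),          # LEDs off
    MirrorState.LISTENING: (0, 0, 255),   # blue
    MirrorState.PROCESSING: (255, 165, 0),  # orange
    MirrorState.EXECUTING: (0, 255, 0),   # green
}


def color_for(state):
    """Return the RGB tuple the LEDs should show for the given state."""
    return STATE_COLORS[state]
```

Whatever LED library is eventually chosen would only need to call `color_for` when the voice pipeline changes state, so the color scheme stays in one place.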

Jan. 29 -> Feb. 4 2018

Ethan

  • Acquired some hardware (2hr)
    • 21.5" monitor for ~$65 from Micro Center
    • Google AIY kit w/ speaker and microphone for ~$10 from Micro Center
  • Burned the Raspberry Pi OS image and set it up to recognize the microphone and speakers (1hr)
  • Ran a test to confirm that GPIO can be activated via Node.js and that Node.js is functional on the Pi (2hr)
  • Voice Recognition
    • Google AIY has a good setup for using a mic and speaker. Fought with the Raspbian OS image trying to make it compatible with AIY but was ultimately unsuccessful, so ended up re-burning the SD card with the Google AIY distribution of Raspbian (4hr)
    • Ran the Google AIY demos to confirm the hardware is functional and configured OAuth for the Google STT services in the Google Cloud API portal. Fought to get Google Assistant hotword detection compatible with Google Cloudspeak, then determined that Cloudspeak itself supports expected commands and hotwords (3hr)
    • Got Python listening for the hotword, then capturing the speech that follows and sending it once the user stops speaking (2hr)
    • Seriously battled with Socket.io and WebSockets in an effort to have persistent 2-way communication between Python and Node.js. Turns out the two are not interchangeable (Socket.io uses its own protocol on top of the transport, so a plain WebSocket client cannot talk to a Socket.io server), and I found precious few Python libraries for WebSockets in the first place, so after several hours decided to table WebSockets for the moment. Currently have 1-way Python->Node.js communication working via the Python requests library and Node.js Express routing. (4hr)
  • Need to update GitHub with the current code and write a user's guide for setting up AIY in case of SD card failure
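
The 1-way Python->Node.js link described above amounts to an HTTP POST of the recognized text. The sketch below is not the project's actual code: it swaps the `requests` library for the standard library's `urllib` so it runs without extra installs, and the `/voice` path, the `transcript` field name, and the `StandInHandler` class (a local stand-in for the Express route) are all hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores proxy settings, since the server is on the same machine.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send_transcript(text, url):
    """POST recognized speech as JSON and return the HTTP status code."""
    body = json.dumps({"transcript": text}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with _opener.open(request, timeout=5) as response:
        return response.status


class StandInHandler(BaseHTTPRequestHandler):
    """Local stand-in for the Express route; records each JSON body."""

    received = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        StandInHandler.received.append(payload)
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    """Start the stand-in server, send one transcript, and report the result."""
    server = HTTPServer(("127.0.0.1", 0), StandInHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/voice" % server.server_port
    try:
        status = send_transcript("what is the weather", url)
    finally:
        server.shutdown()
        server.server_close()
    return status, StandInHandler.received[-1]
```

On the Pi, `send_transcript` would point at the Express server's address instead of the stand-in. Because each utterance is a separate request, Node.js cannot push messages back to Python this way, which is the gap the tabled WebSocket work was meant to close.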

Tony

  • (1hr) Created an "artsy" CAD model for part of the frame and the mirror. These will serve as proof-of-concept ideas until a final design is selected; then I will create frame pieces small enough to be printed.
  • Found a Raspberry Pi microphone/speaker pair and an offline command recognition program



Tech Reflect Voice Project Page: https://classes.engineering.wustl.edu/ese205/core/index.php?title=Tech_Reflect_Voice