Tech Reflect Voice Log

From ESE205 Wiki
Revision as of 19:10, 17 February 2018 by BaihaoXu (talk | contribs)

Jan. 22 -> Jan. 28 2018

Ethan

  • Did some testing and research into Speech-To-Text Engines
    • Cannot test Snowboy hotword detection until we have a Raspberry Pi and a hardware microphone, since it only runs on Linux
    • Looks like the Google voice API will be best for STT; most others struggle significantly with numbers and names
  • IBM has a good Text-To-Speech engine which we will probably use for our TTS needs

Tony

  • Looked into the cost range of an appropriately sized one-way mirror (18"-24" x 24"-36")
  • Looked into cheaper monitors that could cover at least 3/4 of the mirror's surface.
  • Did minor research into possible designs for the exterior frame of the mirror.
  • Current findings: The mirror will most likely be 24x36 unless a significantly cheaper option (18x24) appears in the immediate future. That mirror currently costs $50. I found some relatively inexpensive monitors for around $70 with a diagonal of 30-36", enough to cover the vast majority of the interior of the mirror. The surrounding frame will probably be flat to minimize complications during 3D printing. Also, RGB LEDs, either as strips or as individual LEDs in set locations, will be placed somewhere on the frame (outside or inside the exterior frame) to indicate whether the mirror is listening, processing, or executing a voice command.
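
A minimal sketch of how the three LED states mentioned above could be represented in software. The state names come from the log; the specific colors and the `color_for` helper are hypothetical, since no colors have been chosen yet.

```python
from enum import Enum


class MirrorState(Enum):
    """Voice-command phases the frame LEDs are meant to indicate."""
    LISTENING = "listening"
    PROCESSING = "processing"
    EXECUTING = "executing"


# Hypothetical (R, G, B) choices; the log does not specify any colors.
STATE_COLORS = {
    MirrorState.LISTENING: (0, 0, 255),      # blue
    MirrorState.PROCESSING: (255, 191, 0),   # amber
    MirrorState.EXECUTING: (0, 255, 0),      # green
}


def color_for(state: MirrorState) -> tuple:
    """Return the RGB color the LEDs should show for the given state."""
    return STATE_COLORS[state]
```

Keeping the mapping in one table would make it easy to change colors later without touching whatever code ends up driving the LEDs.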

Baihao(Kevin)

  • Worked on some UI design. Watched online courses on UI/UX design, covering ideas like the 7±1 principle, to draft a basic layout based on our current features.
  • Talked with Ethan about the current state of the project to work out the best strategy for our UI/UX design.
  • Continued researching how to make this the best possible final project by reading "Designing Interfaces" and talking with Ethan and Tony.
  • Researched some cool features


Jan. 29 -> Feb. 4 2018

Ethan

  • Acquired some hardware (2hr)
    • 21.5" monitor for ~$65 from Micro Center
    • Google AIY kit w/ speaker and microphone for ~$10 from Micro Center
  • Burned the Raspberry Pi OS image and set it up to recognize the microphone and speakers (1hr)
  • Ran a test to confirm that GPIO can be activated via Node.js and that Node.js works on the Pi (2hr)
  • Voice Recognition
    • Google AIY has a good setup for using a mic and speaker. Fought with the Raspbian OS image trying to make it compatible with AIY but was ultimately unsuccessful, so ended up re-burning the SD card with the Google AIY distribution of Raspbian (4hr)
    • Ran the Google AIY demos to make sure the hardware works and got into the Google Cloud API portal to configure OAuth for the Google STT services. Fought to get Google Assistant hotword detection compatible with Google Cloudspeak, but then determined that Cloudspeak itself supports expected commands and hotwords (3hr)
    • Got Python running to listen for the hotword, capture the speech that follows, and send it once the user stops speaking (2hr)
    • Seriously battled with Socket.io and WebSockets in an effort to have persistent 2-way communication between Python and Node.js. Turns out Socket.io and WebSockets are totally different, and precious few Python libraries exist for WebSockets in the first place, so after several hours decided to table WebSockets for the moment. Currently have 1-way Python -> Node.js communication working via the Python requests library and Node.js Express routing. (4hr)
  • Need to update GitHub with the current code and write a user's guide for setting up AIY in case of SD card failure
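
A rough sketch of the 1-way Python -> Node.js path described above: take a transcript, keep only the speech after the hotword, and POST it to an Express route. This is not the project's actual code. The hotword, the URL, the route name, and the JSON field `text` are all assumptions made for illustration, and the standard-library `urllib` is swapped in for the `requests` library the log mentions so the example has no dependencies.

```python
import json
import urllib.request
from typing import Optional

HOTWORD = "mirror"                           # hypothetical hotword
NODE_URL = "http://localhost:3000/command"   # hypothetical Express route


def extract_command(transcript: str, hotword: str = HOTWORD) -> Optional[str]:
    """Return the speech that follows the hotword, or None if there is none."""
    words = transcript.strip().split()
    # Compare case-insensitively and ignore trailing punctuation.
    lowered = [w.lower().strip(",.!?") for w in words]
    if hotword not in lowered:
        return None
    rest = words[lowered.index(hotword) + 1:]
    return " ".join(rest) or None


def build_request(command: str, url: str = NODE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the recognized command."""
    body = json.dumps({"text": command}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_command(command: str, url: str = NODE_URL, timeout: float = 5.0) -> int:
    """Send the command to the Node.js server and return the HTTP status."""
    with urllib.request.urlopen(build_request(command, url), timeout=timeout) as resp:
        return resp.status
```

On the Node.js side, an Express route with a JSON body parser would read the `text` field. Because each command is a separate request, the server cannot push anything back to Python on its own, which is the limitation that motivates WebSockets.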

Tony

  • (2hr) Went part shopping w/ Ethan.
  • (1hr) Created "artsy" CAD models for part of the frame and the mirror. These will serve as proof-of-concept ideas until a final design is selected; then I will create the frame pieces small enough to be printed.
  • Found a Raspberry Pi microphone/speaker pair and an offline command recognition program

Baihao(Kevin)

  • Found and studied some tools that will be useful in the next stage of our project, such as Axure RP and Cordova, to figure out which one best fits our project.
  • Found and studied source code of products with similar goals on Quora and StackOverflow, looking for helpful comments that can help us avoid pitfalls in our project.
  • Researched some cool features

Feb. 5 -> Feb. 12 2018

Ethan

  • Did some updates to the project home page (1hr)
  • Threw together a brief example for Kevin to learn Pug (2hr)
  • Configured Node to use Pug as a view engine (1hr)
  • Configured WebSockets for communication between Python HardwareInterface, NodeJS Client, and NodeJS Server (3hr)
  • Began working on framework for command selection from Text Input (4hr)
    • Currently able to do string -> string exact matching, string -> string front & end matching to grab data, and string -> keyword command matching
      • Likely to be very slow at scale, should look into better implementations
  • Able to use GPIO pins through the Pi HAT
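
The three matching modes in the command-selection bullet could look roughly like the sketch below. It is written in Python for illustration only, not taken from the project's code. Reading "front & end match" as "the text starts and ends with known phrases, and the data is whatever sits in between" is an assumption, and every phrase and command name in the tables is hypothetical.

```python
from typing import Optional, Tuple

# All phrases and command names below are hypothetical examples.
EXACT = {"good morning": "greet"}
FRONT_END = [("set a timer for", "minutes", "timer")]  # (front, end, command)
KEYWORDS = {"weather": "show_weather", "news": "show_news"}


def match_command(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (command, data) for the first rule that matches, else None."""
    cleaned = " ".join(text.lower().split())

    # 1. string -> string: the whole input equals a known phrase.
    if cleaned in EXACT:
        return EXACT[cleaned], None

    # 2. front & end: fixed prefix and suffix, data taken from the middle.
    for front, end, command in FRONT_END:
        if (cleaned.startswith(front) and cleaned.endswith(end)
                and len(cleaned) >= len(front) + len(end)):
            data = cleaned[len(front):len(cleaned) - len(end)].strip()
            return command, data

    # 3. string -> keyword: any single word maps to a command.
    for word in cleaned.split():
        if word in KEYWORDS:
            return KEYWORDS[word], None

    return None
```

For example, `match_command("Set a timer for 10 minutes")` returns `("timer", "10")`. The exact and keyword lookups are dictionary hits, so the linear scan over front & end rules is the part most likely to slow down as the command list grows.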

Tony

  • (4hrs) 3D printed two test pieces for the frame.
  • (1hr) Made minor edits to the CAD file for the corner pieces/edge pieces to fix a hole error and resize the shell to speed up printing
  • (1hr) Made a new style of edge piece to mount a speaker in (WIP; have not made mounting pegs yet due to lack of measurements). Created holes for the speaker in the middle of the face.
  • (15m) All parts have now been ordered, and should arrive by Monday, Feb. 12.
  • (1hr) Sanded down the two test pieces to account for a hole/peg sizing error and the failure of the support brim to detach cleanly. I will have to reevaluate whether using a brim to reduce the risk of slight warping on the corners is worth several hours of sanding and the imperfect look the sanding leaves.
  • (10m) Created two new assembly files to mock up the frame with the newly designed pieces. One is used as a test file to ensure all measurements have been entered correctly. This file consists of pairs of different piece types mated together: corner-long edge, long edge-short edge, and speaker plate-long edge. The other is for getting a feel for what the final product will look like, with all 18 pieces in place.
    • The second file is a WIP, as other edge pieces have to be modified to account for the microphone, motion sensor, power cords, and possible mounting brackets for the Arduino/Raspberry Pi.
  • (2.5hr) 3D printed edge piece with speaker holes/mounting bracket. Its complex design may require sanding to ensure that the speaker fits, but watching it print right now, it seems as though it's not warping

Baihao(Kevin)

  • Started studying Pug using the resources provided by Ethan (2hr)
  • Continued studying different UI designs and found some successful templates that fit our product (2hr)
  • Studied customer feedback/comments on products like Digital Mirror Clock and Amazon Echo (0.5hr)
  • Will share a version of the design by the end of the week.

Feb. 13 -> Feb. 20 2018

Baihao(Kevin)

  • (40min) Finished setting up Pug, then started using the files given by Ethan to design the home page.
  • (10min) Made a toy version of the homepage based on the design of MS Windows Phone.
  • (1.5hr) Made a sign-up page for first-time users. It starts with some survey questions; once users answer them, we can figure out how to personalize the frame for them. For example, if a user prefers constellations, the homepage style each day will match the constellation for that calendar day. If a user wants the homepage to stay the same color all the time, it will use a fixed color.
  • (3.5hr) Designed 5 versions of the homepage, each based on a different design principle: one on VR principles, one on Windows Phone, and the other three on my own designs.
    • Next week: I will dive into specific features, such as the layout of the weather conditions. Then, each week, I will design 1 to 2 features. After Spring Break, I will finalize each step of the design, connect them together, and prepare for the final demo.
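
One possible way to turn the sign-up survey answer into a homepage style, sketched in Python for illustration. It assumes "constellation" means the zodiac sign for the current calendar day. The preference strings, the returned dictionary shape, and the default color are hypothetical, and the sign boundary dates are approximate since they vary slightly by source.

```python
from datetime import date

# (month, day) on which each sign starts; approximate boundaries.
ZODIAC_STARTS = [
    ((1, 20), "Aquarius"), ((2, 19), "Pisces"), ((3, 21), "Aries"),
    ((4, 20), "Taurus"), ((5, 21), "Gemini"), ((6, 21), "Cancer"),
    ((7, 23), "Leo"), ((8, 23), "Virgo"), ((9, 23), "Libra"),
    ((10, 23), "Scorpio"), ((11, 22), "Sagittarius"), ((12, 22), "Capricorn"),
]


def zodiac_for(day: date) -> str:
    """Return the zodiac sign whose date range contains the given day."""
    sign = "Capricorn"  # covers Jan 1 through Jan 19
    for start, name in ZODIAC_STARTS:
        # The list is sorted, so the last start date not after `day` wins.
        if (day.month, day.day) >= start:
            sign = name
    return sign


def homepage_theme(preference: str, today: date,
                   fixed_color: str = "#202020") -> dict:
    """Pick a homepage style from the user's survey preference."""
    if preference == "constellation":
        return {"style": "constellation", "sign": zodiac_for(today)}
    # Any other answer falls back to a single fixed color.
    return {"style": "fixed", "color": fixed_color}
```

The view layer could then receive this dictionary and pick a stylesheet or background from it.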

Tech Reflect Voice Project Page: https://classes.engineering.wustl.edu/ese205/core/index.php?title=Tech_Reflect_Voice