Tech Reflect Voice Log

From ESE205 Wiki
Revision as of 04:25, 19 February 2018 by Ethanshry (talk | contribs) (→‎Feb. 13 -> Feb. 20 2018: Ethan Update)

Jan. 22 -> Jan. 28 2018

Ethan

  • Did some testing and research into Speech-To-Text Engines
    • Cannot test Snowboy hotword detection until we have a Raspberry Pi and a hardware microphone, since it only runs on Linux
    • Looks like the Google voice API will be best for STT, since most others struggle significantly with numbers and names
  • IBM has a good Text-To-Speech engine which we will probably use for our TTS needs

Tony

  • Looked into the cost range of an appropriately sized one-way mirror (18"-24" x 24"-36")
  • Looked into cheaper monitors that could cover at least 3/4 of the mirror's surface.
  • Did minor research into possible designs for the exterior frame of the mirror.
  • Current findings: The mirror will most likely be 24x36 unless a significantly cheaper option (18x24) appears in the immediate future. The cost of said mirror is currently $50. I found some relatively inexpensive monitors for around $70, with a diagonal of 30-36", or enough to cover the vast majority of the interior of the mirror. The surrounding frame will probably be flat to minimize complications during the 3D printing process. Also, some sort of RGB LED, either strips or individual LEDs in set locations, will be used somewhere on the frame, whether outside or inside the exterior frame, to indicate that the mirror is listening, processing, or executing a voice command.
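
The LED status indicator described above could be driven from a small state-to-color table. This is only a sketch: the `MirrorState` names, the extra idle state, and every color value are placeholders, since no colors or wiring have been chosen yet.

```python
from enum import Enum


class MirrorState(Enum):
    """States the frame LEDs should be able to signal (IDLE is an added assumption)."""
    IDLE = "idle"
    LISTENING = "listening"
    PROCESSING = "processing"
    EXECUTING = "executing"


# Placeholder (red, green, blue) values in the 0-255 range; not chosen by the team.
STATE_COLORS = {
    MirrorState.IDLE: (0, 0, 0),
    MirrorState.LISTENING: (0, 0, 255),
    MirrorState.PROCESSING: (255, 165, 0),
    MirrorState.EXECUTING: (0, 255, 0),
}


def color_for(state):
    """Return the RGB tuple the LEDs should show for the given state."""
    return STATE_COLORS[state]
```

Keeping the mapping in one table means the colors can be changed later without touching whatever code ends up driving the strips.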

Baihao(Kevin)

  • Started on the UI design. Watched some online courses about UI/UX design, covering ideas like the 7+-1 principle, to draft a basic layout based on our current features.
  • Talked with Ethan about the current state of the project to figure out the best strategy for our UI/UX design.
  • Continued researching how to make this the best possible final project by reading "designing interface" and talking with Ethan and Tony.
  • Researched some cool features


Jan. 29 -> Feb. 4 2018

Ethan

  • Acquired some hardware (2hr)
    • 21.5" monitor for ~$65 from microcenter
    • Google AIY kit w/ speaker and microphone for ~$10 from microcenter
  • Burned the Raspberry Pi OS image and set it up to recognize the microphone and speakers (1hr)
  • Ran a test to confirm that GPIO can be activated via Node.js and that Node.js works on the Pi (2hr)
  • Voice Recognition
    • Google AIY has a good setup for using a mic and speaker. Fought with the Raspbian OS image trying to make it compatible with AIY but was ultimately unsuccessful, so ended up re-burning the SD card with the Google AIY distribution of Raspbian (4hr)
    • Ran the Google AIY demos to ensure the hardware works, then got into the Google Cloud API portal to configure OAuth for the Google STT services. Fought to get Google Assistant hotword detection compatible with Google Cloudspeak, but then determined that Cloudspeak itself supports expected commands and hotwords (3hr)
    • Got Python listening for the hotword, then capturing the speech that follows and sending it once the user stops speaking (2hr)
    • Seriously battled with Socket.io and Websockets in an effort to have persistent 2-way communication between Python and Nodejs. Turns out Socket.io and Websockets are totally different, and precious few Python libraries exist for websockets in the first place, so after several hours decided to table websockets for the moment. Currently have 1-way Python->Nodejs communication working via python requests library and Nodejs Express routing. (4hr)
  • Need to update GitHub with the current code and write a user's guide for setting up AIY in case of SD card failure
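
A rough sketch of the one-way Python->Nodejs link mentioned above, using the requests library to POST recognized speech to an Express route. The URL, the `/command` route, and the `text` field name are all hypothetical, since the log does not record them. The `post` parameter is there only so the function can be tried without a running server.

```python
# Hypothetical endpoint; the actual Express route is not named in the log.
NODE_URL = "http://localhost:3000/command"


def send_transcript(text, url=NODE_URL, post=None):
    """POST recognized speech to the Node.js server and return the HTTP status code."""
    if post is None:
        # Imported lazily so the function can be exercised with a stand-in `post`.
        import requests
        post = requests.post
    payload = {"text": text.strip()}
    # `json=` makes requests serialize the dict and set the JSON content type.
    response = post(url, json=payload, timeout=5)
    return response.status_code
```

Each call sends one utterance. The Express side would need a JSON body parser on the matching route to read the `text` field. Since this is plain request/response, the server cannot push anything back to Python on its own, which is the gap WebSockets would fill.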

Tony

  • (2hr) Went part shopping w/ Ethan.
  • (1hr) Created "artsy" CAD model for part of frame and the mirror. These will be used as proof of concept ideas, until a final design is selected. Then I will create the frame pieces small enough to be printed.
  • Found a raspberry pi microphone/speaker pair + an offline command recognition program

Baihao(Kevin)

  • Found and studied some tools that will be useful in the next stage of our project, like Axure RP and Cordova, to figure out which one best fits our project.
  • Found and studied source code and discussions of products with similar goals on Quora and StackOverflow, looking for helpful comments that can help us avoid pitfalls in our project.
  • Researched some cool features

Feb. 5 -> Feb. 12 2018

Ethan

  • Did some updates to the project home page (1hr)
  • Threw together a brief example for Kevin to learn Pug (2hr)
  • Configured Node to use Pug as a view engine (1hr)
  • Configured WebSockets for communication between Python HardwareInterface, NodeJS Client, and NodeJs Server (3hr)
  • Began working on framework for command selection from Text Input (4hr)
    • Currently able to do three kinds of command matching: string->string (exact), string->string front & end match to grab data, and string->keyword
      • Likely to be very slow at scale, should look into better implementations
  • Able to use GPIO pins through PI Hat
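
The three matching modes listed above (exact string, front & end match, keyword) could be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the function names, command tables, and example phrases are all made up.

```python
def match_exact(text, commands):
    """string -> string: the whole utterance must equal a known phrase."""
    return commands.get(text.strip().lower())


def match_front_end(text, prefix, suffix=""):
    """Front & end match: return the text between a fixed prefix and suffix, or None."""
    cleaned = text.strip().lower()
    if not cleaned.startswith(prefix) or not cleaned.endswith(suffix):
        return None
    end = len(cleaned) - len(suffix)
    if end < len(prefix):
        # Prefix and suffix overlap, so there is nothing in between.
        return None
    middle = cleaned[len(prefix):end].strip()
    return middle or None


def match_keyword(text, keyword_commands):
    """string -> keyword: return the command of the first table keyword found in the utterance."""
    words = set(text.lower().split())
    for keyword, command in keyword_commands.items():
        if keyword in words:
            return command
    return None
```

For example, `match_front_end("What is the weather in Saint Louis today", "what is the weather in", "today")` pulls out `"saint louis"`. The exact match is a single dictionary lookup, so it should stay fast as the command list grows. The front & end and keyword modes still check patterns one at a time, which is where the scaling concern would show up.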

Tony

  • (4hrs) 3D printed two test pieces for the frame.
  • (1hr) Made minor edits to the CAD file for the corner pieces/edge pieces to account for a hole error and resizing the shell to speed up printing
  • (1hr) Made a new style of edge piece to mount a speaker in (WIP, have not made mounting pegs due to lack of measurements). Created holes for speaker in middle of face.
  • (15m) All parts have now been ordered, and should arrive by Monday, Feb. 12.
  • (1hr) Sanded down the two test pieces to account for hole/peg sizing error and failure of support brim to detach cleanly. I will have to reevaluate whether using a brim is worth the risk of slight warping on the corners versus saving several hours of sanding and an imperfect look because of the sanding.
  • (10m) Created two new assembly files to mock up the frame with the newly designed pieces. One is used as a test file, to ensure all measurements have been entered in correctly. This file consists of two different types of pieces mated together, corner-long edge, long edge-short edge, and speaker plate-long edge. The other is for getting a feel of what the final product will look like, with all 18 pieces in place.
    • Second file is a WIP, as other edge pieces have to be modified to account for microphone, motion sensor, power cords, and possible mounting brackets for arduino/raspi.
  • (2.5hr) 3D printed edge piece with speaker holes/mounting bracket. Its complex design may require sanding to ensure that the speaker fits, but watching it print right now, it seems as though it's not warping

Baihao(Kevin)

  • Started studying Pug using the resources provided by Ethan (2hr)
  • Continued studying different UI designs and found some successful templates that fit our product (2hr)
  • Studied customer feedback/comments on products like Digital Mirror Clock and Amazon Echo (0.5hr)
  • Will share a version of the design by the end of the week.

Feb. 13 -> Feb. 20 2018

Baihao(Kevin)

  • (40min) Finished setting up Pug, then started using the files given by Ethan to design the home page.
  • (10min) Made a toy version of the homepage based on the design of MS Windows Phone.
  • (1.5 hr) Made a sign-up page for first-time users. It starts with some survey questions; once users answer them, we can figure out how to personalize our frame for them. For example, some users may prefer a constellation theme: each day when they see the mirror, the homepage style will match the constellation for that calendar day. If a user wants the homepage to stay the same color all the time, it will use a fixed color.
  • (3.5 hr) Designed 5 versions of the homepage, each based on a different design principle. One is based on VR principles, one on Windows Phone, and the remaining three on my own designs.
    • Next week: I will dive into specific features, like the layout of the weather conditions. Then, each week, I will design 1 to 2 features. After Spring Break, I will finalize each step of the design, connect them together, and prepare for the final demo.

Tony

  • (4hr) Finished printing the pieces for the bottom of the frame. We verified that the mirror (which arrived this week) fits in the frame with a low enough tolerance to not slide out. I may have to sand down some of the parts to reduce the tolerance, but this will be decided closer to final assembly.
  • (1hr) Ethan and I discussed the back portion of the frame. We decided on a wooden back, with the top edge of the frame detachable from the rest so the mirror can be serviced.
  • (2hr) Redesigned some pieces to make them fit together more smoothly. Also created the new part files for the side pieces, which have a tab for LED strips and holes for cabling. After I made those, I updated the main assembly file to reflect the new changes. So far, there are no conflicts between parts. The last few parts I need to design are: the top piece with the microphone slot, and the piece with the hole for the motion sensor.
  • (15m) I got bored in my CAD class, so I found a lion STL file online, and did some magic computer stuff to make it more decorative, lie flat, and fit on our mirror. If we have time, I will print 2 of these for the top of the frame.
  • Possible upcoming problems: the 2 most recent parts have had significant warping at the corners due to a fault with the printer in the CAD lab. Hopefully, by switching to the printers in the Urbauer lab, these parts can be printed more quickly and with less warping. I prefer not to use a "brim" around the part, as I tried this with the first 2 corner pieces, and I spent close to an hour and a half trying to remove them and all their residual PLA from the parts. Also, the 6x4 will not fit on the printer with the brim.

Ethan

  • (1hr) Meeting w/ Kevin to familiarize with Pug
  • (1hr) API research, found potential stock API and confirmed viability of weather and Twitter APIs
  • (3hr) Work on voice commands and structure of code, exploratory look into recording hotword detection and callback on hotword pickup
  • (1hr) Meeting w/ Kevin to go over design specs
  • Aside: locking the feature list for the moment; will come back and look into other features after current plans are fully implemented
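
A minimal sketch of the "callback on hotword pickup" idea from the voice command work above. As a simplification it works on text that has already been transcribed, whereas real hotword detection runs on audio. The hotword `"mirror"` and the function name are hypothetical.

```python
# Hypothetical hotword; the log does not state which word is used.
HOTWORD = "mirror"


def dispatch_on_hotword(transcript, on_command, hotword=HOTWORD):
    """Call on_command with the words after the hotword; return True if it fired."""
    words = transcript.lower().split()
    target = hotword.lower()
    if target not in words:
        return False
    # Everything after the first occurrence of the hotword is treated as the command.
    command = " ".join(words[words.index(target) + 1:])
    on_command(command)
    return True
```

Passing the handler in as a callback keeps detection separate from command handling, so the same function could feed the command matcher or just a logger while testing.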

Tech Reflect Voice Project Page: https://classes.engineering.wustl.edu/ese205/core/index.php?title=Tech_Reflect_Voice