Difference between revisions of "Tech Reflect Voice"

Revision as of 03:52, 22 April 2018

Project Overview

The best creations come not from reinventing the wheel, but from integrating existing technologies in new and interesting ways. When we saw the original Tech Reflect project, we realized how much opportunity there was to improve upon it. With the increasing prevalence of IoT devices, home assistants like Google Home and Amazon Alexa, and open-source and free-to-use APIs, along with the decreasing cost of display technology, it has become possible to cheaply and easily build a piece of home hardware that combines the strengths of home assistants and inexpensive displays while keeping them unobtrusive in your life.

Group Members

Ethan Shry
Tony Sancho-Spore
Baihao Xu (Kevin)
Ellen Dai (TA)

Project Proposal

https://docs.google.com/presentation/d/1UtUwZfxM7SI90nvJo0Fx1iLGG8HB7N2jdl1ApcL0bRY/edit?usp=sharing

Objectives

We hope to construct a proof-of-concept bathroom mirror which responds to user feedback. The mirror and GUI should be somewhat aesthetically pleasing. It should listen to the user and be able to convert what they ask for into a visual response displayed on-screen. We hope to show that there is some novelty or value in integrating technology into everyday items like mirrors.

Challenges

Given the tools available to us, getting the hardware to talk to the Python listener, and the Python listener to talk to the GUI server, will be logistically challenging, especially as we will be running code in several different programming languages.

Choosing which command to run when analyzing user speech will also be difficult if we decide to allow multiple trigger commands. A simple solution would look only for exact string matches; a more robust solution will require further research.
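To make the trade-off concrete, here is a minimal sketch of both approaches: an exact match on normalized text, with a fuzzy fallback from Python's standard difflib module. The command list, the function name match_command, and the 0.6 cutoff are assumptions made for illustration, not part of our actual parser.

```python
import difflib
from typing import Optional

# Hypothetical command phrases, for illustration only.
COMMANDS = ["show weather", "show calendar", "switch user"]

def match_command(text: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the best-matching known command, or None if nothing is close."""
    # Normalize case and whitespace so "Switch  User" matches "switch user".
    cleaned = " ".join(text.lower().split())
    # Simple solution: exact string match.
    if cleaned in COMMANDS:
        return cleaned
    # More robust fallback: closest command by similarity ratio.
    close = difflib.get_close_matches(cleaned, COMMANDS, n=1, cutoff=cutoff)
    return close[0] if close else None

print(match_command("Switch User"))   # exact match after normalization
print(match_command("switch users"))  # near miss caught by the fuzzy fallback
print(match_command("xyzzy"))         # nothing similar, so None
```

A higher cutoff makes the fallback stricter, which reduces false triggers at the cost of rejecting more misheard commands.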

Our current plan for the mirror is to 3D print the frame. Because no large printers are available to us, the frame must be printed in many pieces (more than 10), which may prove infeasible.

Additionally, Kevin will need to become comfortable with the Pug templating language and Node.js.

Gantt Chart

Media:GanttChartTRVoiceSpring2018.PNG

Budget

21.5" Display: $65 @ Microcenter

Google AIY Voice Kit (Includes Speaker and Microphone): $10 @ Microcenter (DISCONTINUED?)

Mirror: $50 @ Amazon (https://www.amazon.com/gp/product/B01G4MQ966/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1)

Raspberry Pi 3: $35

Raspberry Pi NoIR Camera: $25

Arduino Mini: $10

LED Strip: About $25 for more than you need

3 spools of standard breadboard wire (22 AWG): $15

Frame: $20ish for wood and screws

Paint: $5

Power Strip: $10

3D Printed Frame: $35 (approximately 1kg of PLA)

2 Power Supplies/MicroUSB Cables: $25

Epoxy: $10

Total Cost: $340 (ish)
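As a quick check on the total, the line items above can be summed directly. The dictionary keys below are shorthand labels for the budget entries, and the "ish" prices are taken at face value.

```python
# Budget line items in US dollars, copied from the list above.
budget = {
    "21.5in display": 65,
    "Google AIY Voice Kit": 10,
    "Mirror": 50,
    "Raspberry Pi 3": 35,
    "Raspberry Pi NoIR Camera": 25,
    "Arduino Mini": 10,
    "LED strip": 25,
    "Wire (3 spools)": 15,
    "Wood frame and screws": 20,
    "Paint": 5,
    "Power strip": 10,
    "3D printed frame (PLA)": 35,
    "Power supplies and cables": 25,
    "Epoxy": 10,
}

total = sum(budget.values())
print(f"Total: ${total}")  # Total: $340
```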

Code

All the code for this project can be found on GitHub.

Design And Solutions: Modules

Software Overview

Speech to Text

INSERT STT RESEARCH HERE After sampling different speech-to-text libraries, it became clear that Google Cloudspeech was superior to the other solutions for what we wanted to achieve. While it would have been nice to run something locally, like CMU Sphinx, or to use a totally free solution, in the end Google was the easiest to integrate with and had the most accurate speech-to-text conversion.

Additionally, using Google Cloudspeech allowed us to use the Google AIY Voice Kit, which was cheap and made integration with Google's services very easy thanks to a pre-configured Raspbian distro.

What we essentially have is a two-loop system, first for hotword detection and then for speech recognition, as the following pseudocode indicates:

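# Stage 1: keep transcribing speech until we are told to stop.
while checkIfShouldListen():
    text = listen()                      # transcribe the next utterance
    if hotword in text:
        # Stage 2: the utterance after the hotword is the command.
        commandText = listen()
        sendToNodeServer(commandText)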
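The pseudocode above leaves its helper functions undefined. A runnable sketch of the same two-stage loop follows, with the microphone and the Node server replaced by injected callables so it can be tried without hardware. The function name run_listener, the hotword "mirror", and the convention that listen() returns None when listening should stop are all assumptions made for illustration.

```python
from typing import Callable, Optional

def run_listener(listen: Callable[[], Optional[str]],
                 send: Callable[[str], None],
                 hotword: str = "mirror") -> None:
    """Wait for the hotword, then forward the next utterance as a command."""
    while True:
        text = listen()
        if text is None:              # no more audio: stop listening
            return
        if hotword in text.lower():   # stage 1: hotword detected
            command = listen()        # stage 2: capture the command itself
            if command is None:
                return
            send(command)

# Canned transcripts stand in for the speech-to-text service.
transcripts = iter(["what a nice day", "hey mirror",
                    "show the weather", "unrelated chatter"])
sent = []
run_listener(lambda: next(transcripts, None), sent.append)
print(sent)  # ['show the weather']
```

Only the utterance that directly follows the hotword is forwarded; everything else is discarded.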

Command Parsing

Facial Recognition

To recognize a person's face, we used Amazon's Rekognition AWS service. Rekognition is designed to work as a complete Computer Vision (CV) API, which includes facial recognition, object detection, image comparison, etc. We also used Amazon's S3 AWS service, which provides cloud file storage much like Google Drive but is built to work with the other AWS APIs. The complete version of our facial recognition program works as follows:

Wait for the "Switch User" command
Take a photo using the Raspi camera
Upload the picture to the S3 bucket as "UnknownFace.jpg"
Compare the picture to every other picture in the S3 bucket ("TonyFace.jpg", "EthanFace.jpg"...)
If any of the pictures match the unknown face, then we know who is currently using the mirror
Else, switch back to the default user (Future Update: be able to add users on the fly)
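The comparison steps above can be sketched as follows. This is an illustration under stated assumptions rather than our actual program: the function names, the 90% similarity threshold, and the "default" user label are hypothetical. The boto3 compare_faces call is real, but it needs AWS credentials and a bucket, so the matching logic takes the comparison as a parameter and is demonstrated here with a stand-in.

```python
from typing import Callable, Dict

def rekognition_match(bucket: str, unknown_key: str, known_key: str,
                      threshold: float = 90.0) -> bool:
    """Ask Rekognition whether two images in an S3 bucket show the same face."""
    import boto3  # imported lazily so the logic below runs without AWS set up
    client = boto3.client("rekognition")
    response = client.compare_faces(
        SourceImage={"S3Object": {"Bucket": bucket, "Name": known_key}},
        TargetImage={"S3Object": {"Bucket": bucket, "Name": unknown_key}},
        SimilarityThreshold=threshold,
    )
    return len(response["FaceMatches"]) > 0

def identify_user(unknown_key: str,
                  known_faces: Dict[str, str],
                  matches: Callable[[str, str], bool],
                  default: str = "default") -> str:
    """Compare the unknown photo to each known face; fall back to the default user."""
    for user, known_key in known_faces.items():
        if matches(unknown_key, known_key):
            return user
    return default

# Stand-in comparison so the example runs offline: pretend Ethan is at the mirror.
known = {"Tony": "TonyFace.jpg", "Ethan": "EthanFace.jpg"}
fake_match = lambda unknown, known_key: known_key == "EthanFace.jpg"
print(identify_user("UnknownFace.jpg", known, fake_match))  # Ethan
```

With real credentials, the stand-in could be swapped for a lambda that calls rekognition_match with the bucket name. Error handling, such as a photo that contains no face, is omitted here.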

LED Ring

The LED Ring is a ring of 128 RGB LEDs controlled by an Arduino Uno. The Uno is connected to the Raspi 3 via a UART serial connection and to the LED strip via a single signal pin. When the Arduino receives a mode number from the Raspi corresponding to some action or command from the user, it updates the LED pattern.

Some of the modes are:

Steady on 1 color
Fade in and out 1 color
"Chasing" start up sequence
Wrap up from bottom to top once when hotword is detected
All off
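On the Raspi side, sending a mode could look like the sketch below. The mode numbering, the one-byte-per-mode protocol, and the names LedMode and send_mode are assumptions for illustration; the Arduino sketch would have to agree on the same numbering. An in-memory buffer stands in for the serial port so the example runs without hardware. On the real mirror the port object could come from the pyserial package, for example serial.Serial("/dev/serial0", 9600), where the device path and baud rate are also assumptions.

```python
import io
from enum import IntEnum
from typing import BinaryIO

class LedMode(IntEnum):
    # Hypothetical numbering of the modes listed above.
    ALL_OFF = 0
    STEADY = 1
    FADE = 2
    CHASE_STARTUP = 3
    HOTWORD_WRAP = 4

def send_mode(port: BinaryIO, mode: LedMode) -> None:
    """Write the mode number to the serial port as a single byte."""
    port.write(bytes([int(mode)]))
    port.flush()

# In-memory stand-in for the UART connection to the Arduino.
fake_port = io.BytesIO()
send_mode(fake_port, LedMode.HOTWORD_WRAP)
print(fake_port.getvalue())  # b'\x04'
```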

The particular LED strip we're using is finicky: it took about 3 hours and 1 blown-up Arduino to get it working properly with the Raspi. To help you avoid the same fate, here are the issues we ran into, along with some we foresaw before they could happen:

Not sharing a common ground. This was a tricky one to debug (shoutout to Professor Feher for figuring this one out!!), as it wasn't obvious to us that the ground voltages of the Arduino and the LEDs could differ even though both were plugged into the same power strip.
Not placing a suitably sized capacitor on the LED strip's power leads.
Not placing an isolator between the Arduino and the LED strip (blew up an Arduino Leonardo due to this one :( )
Not tying the signal pin of the LEDs to ground
Not placing a small resistor on the signal pin of the Arduino connected to the LEDs

Physical Design

3D Printed Frame

Results

SubPart: Future Improvements/ How would we improve on this Project

Cost Saving: It would have been great to cut costs on this project. One major cost-saving measure would have been to use a piece of glass with a reflective coating instead of an actual two-way mirror. This is more technically challenging but, based on brief research, is about half as expensive per square foot. Looking harder for a cheaper display would also have been a good way to cut costs.

Physical Design: Spending more time on the physical design of the mirror would have been nice. The 3D printed frame is a great idea for our context (testing out many different technologies and seeing what we can make work), but it would have been much easier to spend that time designing a better, lighter frame/box for our mirror and components, as opposed to spending hours CADing and printing parts.

We would also love to get a better mic/speaker onto this mirror. We got ours as part of the AIY Voice HAT Kit, which was cheap and easy, but a better speaker would definitely improve the project's functionality.

Software:

Tech Reflect Voice Log: https://classes.engineering.wustl.edu/ese205/core/index.php?title=Tech_Reflect_Voice_Log


