Difference between revisions of "Tech Reflect Voice"

Revision as of 22:24, 27 April 2018

Project Overview

The best creations come not from reinventing the wheel, but from integrating existing technologies in new and interesting ways. This is why, when we saw the original Tech Reflect project, we realized that there was so much opportunity to improve upon it. With the increasing prevalence of IoT devices, home assistants like Google Home and Amazon Alexa, open-source and free-to-use APIs, and the decreasing cost of display technology, it has become possible to cheaply and easily create a piece of physical hardware for the home that combines the strengths of home assistants and cheap displays while minimizing their obtrusiveness in your life.

Group Members

Ethan Shry
Tony Sancho-Spore
Baihao Xu (Kevin)
Ellen Dai (TA)

Project Proposal

https://docs.google.com/presentation/d/1UtUwZfxM7SI90nvJo0Fx1iLGG8HB7N2jdl1ApcL0bRY/edit?usp=sharing

Objectives

We hope to construct a proof-of-concept bathroom mirror which responds to user feedback. The mirror and GUI should be somewhat aesthetically pleasing. It should listen to the user and be able to convert what they ask for into a visual response displayed on-screen. We hope to show that there is some novelty or value in integrating technology into everyday items like mirrors.

Challenges

Due to the tools available to us, ensuring that the hardware talks to the Python listener, which in turn talks to the GUI server, will be somewhat logistically challenging, especially as we will be running code in several different programming languages.

Selecting which command to run when analyzing user speech will also be a nightmare should we decide to allow multiple different trigger commands. A simple solution would look only for exact string matches, but a more robust solution will require further investigation.
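To make the difference between the two approaches concrete, here is a minimal sketch that contrasts exact matching with a similarity-based fallback from Python's standard `difflib` module. The command phrases and the 0.6 cutoff are assumptions for illustration only; they are not the project's actual trigger list or tuning.

```python
import difflib

# Hypothetical command phrases; the real trigger list is not given in this write-up.
COMMANDS = [
    "show weather",
    "show twitter",
    "show stocks",
    "show reminders",
    "set timer",
    "switch user",
]


def normalize(text):
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def match_exact(text, commands=COMMANDS):
    """Return the command only if the normalized text equals it exactly."""
    normalized = normalize(text)
    return normalized if normalized in commands else None


def match_fuzzy(text, commands=COMMANDS, cutoff=0.6):
    """Return the most similar command, or None if nothing reaches the cutoff."""
    matches = difflib.get_close_matches(normalize(text), commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With these assumed phrases, "show the weather" fails the exact match but still resolves to "show weather" through the fuzzy match, while an unrelated phrase resolves to nothing. A higher cutoff trades fewer false triggers for more missed commands.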

Our current plan is to 3D print the mirror frame. Because no large printers are available to us, it must be printed in many pieces (more than 10), which may prove infeasible.

Additionally, Kevin will need to become comfortable with the Pug templating language and Node.js.

Gantt Chart

Media:GanttChartTRVoiceSpring2018.PNG

Budget

21.5" Display: $65 @ Microcenter

Google AIY Voice Kit (Includes Speaker and Microphone): $10 @ Microcenter (DISCONTINUED?)

Mirror: $50 @ Amazon (https://www.amazon.com/gp/product/B01G4MQ966/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1)

Raspberry Pi 3: $35

Raspberry Pi NoIR Camera: $25

Arduino Mini: $10

LED Strip: About $25 for more than you need

3 spools Standard Breadboard-y wire (22 gauge? is that a thing?): $15

Frame: $20ish for wood and screws

Paint: $5

Power Strip: $10

3D Printed Frame: $35 (approximately 1kg of PLA)

2 Power Supplies/MicroUSB Cables: $25

Epoxy: $10

Total Cost: $340 (ish)

Code

All the code for this project can be found on GitHub.

Design And Solutions: Modules

Product Design Process Overview

Step 1: Find target users

We think our target users are people between the ages of 20 and 40, because they need to use their time efficiently and want the most up-to-date news every day. Our Twitter feature keeps them informed about the news of the day while they use the mirror, and they can check the day's weather and tasks at the same time. This makes it an ideal product for them.

Step 2: Design our use cases

1. Our users need to view interesting news around them: Twitter, Stock

2. Our users need to know what they need to do during the day: Reminder

3. Our users need to know the environment around them: Weather

4. Our users need some personalized features: Timer

Step 3: Designing the main page

When designing the main page, our goal was to give new users an idea of what the mirror does. For existing users, we want to keep the display from becoming boring when they see it every day, and our approach is to make the display as clear as possible.

Step 4: Target customers

1. Users (people between the ages of 20 and 40)

2. Families (to install at home)

3. Schools (for display purposes)

Step 5: How do our target buyers prioritize their needs?

1. Users: Our users will care about utility first, which means weather, reminder, and timer. The timer helps them control how much time they spend getting ready every morning. The weather helps them decide what clothes to wear for the day. The reminder keeps them from forgetting anything during the day. Twitter and Stock are optional features, because users might not have time to look at them, but they will be very attractive to the users who like them and will make us more competitive against our competitors.

2. Family: Family users may need features similar to those of our individual buyers. However, they might need more powerful back-end support, because more information, such as reminders, needs to be saved in the database, and some features need to handle multiple items at once; for example, the weather section may need to store the weather for many cities at the same time.

3. School: School users may care more about how effective our display is. Compared to a traditional TV, we are more innovative. However, they want to make sure it can really attract students' attention rather than being regarded as just a mirror. To attract school clients, we have to guarantee that people can easily see the information on our screen.

Step 6: Prioritize our features

1. Reminder

2. Weather

3. Timer

4. Stock

5. Twitter

Step 7: Strategy

1. Create a UI with a black background and white text, so that we get the best transparency experience for our users. Then implement all 5 of our core features.

2. Create a multi-color background. However, the transparency experience will not be as good. Also, further development of the product in the future would place more requirements on our developers.

3. Use a white background and black text. Implement only the top 3 features, because this will save a large amount of our time.

Step 8: Based on our analysis, we chose the first strategy, which serves to attract all of our customers and maximize our features, and started to put effort into our work.

Our Design Principles

Main Page: We gave our main page a black background because the screen sits behind the mirror, and we made all text and pictures white; we think this is the best way to guarantee that users see the screen clearly. As the main page, it also needs to let users know what features we offer, so it lists all of them, such as Twitter and reminder. For the weather feature, we even show an icon for each type of weather, such as cloudy or raining.

Twitter: We try to make the user experience the same as Twitter's, but because our screen is generally black and white only, we keep the background black and the text white. We redesigned the page layout to better fit our screen. In one respect it is better than the real Twitter, because this is an ad-free version.

Weather: In our weather section, we imported many icons, because we want to show users what the weather will be like today alongside the text. To keep each day's UX consistent, we use a single icon set, which makes our product feel higher quality. If some users prefer another set of weather icons, we can change it based on their preference.

Reminder: The reminder page reminds people what to do every day, so the most important thing here is to let users know what they need to do, where, and at what time. Our layout includes these parameters to make the experience more user-friendly.

Timer: Our timer helps users control how much time they spend at the mirror every day. For example, when users have just gotten up in the morning, their time is valuable; we want them not to miss important things (by using our reminder feature) while still enjoying the features the mirror offers.

Stock: This is an optional feature because it only appeals to people who follow stocks. They can view the most up-to-date stock information on the mirror, which may be the most exciting news of their day.

Software Overview

HTML&CSS to PUG

When we design the layouts for our project, we follow a 2-step procedure. First, we write our ideas in HTML. After all members of the group agree on the design, we translate it to Pug. Pug is a high-performance template engine heavily influenced by Haml and implemented in JavaScript for Node.js and browsers. It is an open-source HTML templating language for Node.js (server-side JavaScript) that makes it possible to write dynamic and reusable HTML documents. Once the front-end UX design is complete, we use Pug templates to connect the pages to our back-end. The tutorial and syntax of Pug can be found at https://github.com/pugjs/pug.

Speech to Text

After sampling different speech-to-text libraries, it became clear that Google Cloudspeech was vastly superior to any other solution for what we wanted to achieve. While it would have been nice to run something locally, like CMU Sphinx, or to use a totally free solution, in the end Google was the easiest to integrate with and had the most accurate speech-to-text conversion.

Additionally, using Google Cloudspeech allowed us to use the Google AIY Voice Kit, which was cheap and made integration with Google's services incredibly easy thanks to a pre-configured distro of Raspbian.

What we essentially have is a two-loop system for hotword detection followed by speech recognition, as the following pseudocode indicates:

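shouldBeListening = True
while shouldBeListening:
    # Re-check on every pass whether the listener should keep running
    shouldBeListening = checkIfShouldListen()
    # Loop 1: transcribe ambient speech and look for the hotword
    text = listen()
    if hotword in text:
        # Loop 2: the next utterance is treated as the command
        commandText = listen()
        sendToNodeServer(commandText)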
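The same two-stage flow can be tried out without a microphone. The sketch below is a runnable approximation, not our actual listener: the speech-to-text call, the Node server call, and the stop check are passed in as plain functions, and the hotword "mirror" is an assumption made only for this example.

```python
from collections import deque


def run_listener(listen, send_to_node_server, should_listen, hotword="mirror"):
    """Wait for the hotword, then forward the next utterance as the command."""
    while should_listen():
        text = listen()
        if text and hotword in text.lower():
            command_text = listen()
            # listen() may return nothing if the user stays silent
            if command_text:
                send_to_node_server(command_text)


def simulate(utterances, hotword="mirror"):
    """Run the loop over scripted utterances and return what would be sent."""
    queue = deque(utterances)
    sent = []
    run_listener(
        listen=lambda: queue.popleft() if queue else None,
        send_to_node_server=sent.append,
        should_listen=lambda: bool(queue),
        hotword=hotword,
    )
    return sent


if __name__ == "__main__":
    print(simulate(["good morning", "hey mirror", "show weather", "thanks"]))
```

Passing the three functions in keeps the loop independent of the audio hardware, so the hotword logic can be checked on any machine before it is wired to the real speech and server calls.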

Command Parsing

Facial Recognition

To recognize a person's face, we used Amazon's Rekognition AWS service. Rekognition is designed to work as a complete Computer Vision (CV) API, which includes facial recognition, object detection, image comparison, etc. We also used Amazon's S3 storage service, which plays a role similar to Google Drive but is accessible to the other AWS APIs. The complete version of our facial recognition program works as follows:

Wait for the "Switch User" command
Take a photo using the Raspi camera
Upload the picture to the S3 bucket as "UnknownFace.jpg"
Compare the picture to every other picture in the S3 bucket ("TonyFace.jpg", "EthanFace.jpg"...)
If any of the pictures match the unknown face, then we know who is currently using the mirror
Else, switch back to the default user (Future Update: be able to add users on the fly)
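The steps above can be sketched roughly as follows. This is an illustration under assumptions, not our exact program: the function name `identify_user`, the bucket name, the 90.0 similarity threshold, and the "default" user label are all made up for the example. The clients are passed in as arguments and are expected to behave like boto3's S3 and Rekognition clients (`upload_file`, `list_objects_v2`, `compare_faces`).

```python
def identify_user(s3, rekognition, bucket, photo_path,
                  unknown_key="UnknownFace.jpg", threshold=90.0,
                  default_user="default"):
    """Upload a fresh photo and compare it with every other face in the bucket."""
    # Upload the new photo under the fixed "unknown" key
    s3.upload_file(photo_path, bucket, unknown_key)

    # List the stored reference photos (a single call returns at most 1000 keys)
    listing = s3.list_objects_v2(Bucket=bucket)
    for obj in listing.get("Contents", []):
        key = obj["Key"]
        if key == unknown_key:
            continue
        response = rekognition.compare_faces(
            SourceImage={"S3Object": {"Bucket": bucket, "Name": unknown_key}},
            TargetImage={"S3Object": {"Bucket": bucket, "Name": key}},
            SimilarityThreshold=threshold,
        )
        if response["FaceMatches"]:
            # "TonyFace.jpg" -> "Tony", following the file naming in the steps above
            name = key.rsplit(".", 1)[0]
            return name[:-4] if name.endswith("Face") else name

    # No stored face matched, so fall back to the default user
    return default_user
```

On the Raspi this would be called with `boto3.client("s3")` and `boto3.client("rekognition")`, assuming AWS credentials are already configured. Taking the clients as parameters also allows the matching logic to be exercised with stand-in objects, without network access.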

Front End

LED Ring

The LED Ring is a ring of 128 RGB LEDs controlled by an Arduino Uno. The Uno is connected to the Raspi 3 via a UART serial connection, and to the LED strip via 1 signal pin. When the Arduino receives a mode number from the Raspi corresponding to some action/command from the user, it updates the pattern shown on the LEDs.

Some of the modes are:

Steady on 1 color
Fade in and out 1 color
"Chasing" start up sequence
Wrap up from bottom to top once when hotword is detected
All off
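On the Raspi side, selecting one of these modes comes down to writing a mode number to the serial port. The sketch below shows one way this could look. The specific mode numbers and the one-byte-per-mode encoding are assumptions for illustration; the values our Arduino sketch actually expects are not listed here.

```python
from enum import IntEnum


class LedMode(IntEnum):
    """Illustrative mode numbers; the real values live in the Arduino sketch."""
    ALL_OFF = 0
    STEADY = 1
    FADE = 2
    CHASE_STARTUP = 3
    HOTWORD_WRAP = 4


def send_mode(port, mode):
    """Write one mode number as a single byte to an open serial-like object."""
    # LedMode(mode) raises ValueError for numbers that are not defined above
    port.write(bytes([int(LedMode(mode))]))
```

With the pyserial package, `port` could be something like `serial.Serial("/dev/serial0", 9600)`, where both the device path and the baud rate are assumptions that must match the Arduino side. Any object with a `write` method works, which makes the function easy to check without hardware attached.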

The particular LED strip we're using is finicky, requiring about 3 hours and 1 blown-up Arduino to get working properly with the Raspi. To help others avoid the same fate, here are the issues we ran into, along with some we foresaw before they could happen:

Not sharing common ground. This was a tricky one to debug (Shoutout to Professor Feher for figuring this one out!!), as it wasn't obvious to us that the ground voltage of the Arduino and the LEDs could be different given that they were both plugged into the same power strip.
Not placing a suitably sized capacitor on the LED strip's power leads.
Not placing an isolator between the Arduino and the LED strip (blew up an Arduino Leonardo due to this one)
Not tying the signal pin of the LEDs to ground
Not placing a small resistor on the signal pin of the Arduino connected to the LEDs

Physical Design

3D Printed Frame

Results

SubPart: Future Improvements/ How would we improve on this Project

Cost Saving: Would have been great to cut costs on this project. One major cost-saving measure would have been to use a piece of glass and a reflective coating instead of an actual two-way mirror. This is more technically challenging but is about half as expensive per square foot based on brief research. Also, looking harder for a cheaper display would have been a great way to cut costs.

Physical Design: Spending more time on the physical design of the mirror would have been nice. The 3D printed frame is a great idea for our context (testing out many different technologies and seeing what we can make work), but it would have been much easier to simply spend more time designing a better, lighter frame/box for our mirror and components, as opposed to spending hours CADing and printing parts.

Also would love to get a better mic/speaker onto this mirror- we got ours as part of the AIY Voice HAT Kit, which was cheap and easy, but a better speaker would definitely improve the project's functionality.

Software:

Tech Reflect Voice Log: https://classes.engineering.wustl.edu/ese205/core/index.php?title=Tech_Reflect_Voice_Log

@@ Line 137: / Line 137: @@
 === HTML&CSS to PUG===
-When we design the leayouts for our project, we just follow our 2-step procedure. First, we write our ideas in HTML. After all members of the group agree with our design, we start to translate to Pug. Pug is a high-performance template engine heavily influenced by Haml and implemented with JavaScript for Node.js and browsers. It provides the ability to write dynamic and reusable HTML documents, its an open source HTML templating language for Node.js (server-side JavaScript). So, if we use pug, after we complete our front UX design, we will use Pug to communicate with our back-end.
+When we design the leayouts for our project, we just follow our 2-step procedure. First, we write our ideas in HTML. After all members of the group agree with our design, we start to translate to Pug. Pug is a high-performance template engine heavily influenced by Haml and implemented with JavaScript for Node.js and browsers. It provides the ability to write dynamic and reusable HTML documents, its an open source HTML templating language for Node.js (server-side JavaScript). So, if we use pug, after we complete our front UX design, we will use Pug to communicate with our back-end. The tutorial and syntax of Pug can be found at https://github.com/pugjs/pug.
-Here is a sample translating process:
-<!DOCTYPE html> . => doctype html
-<html lang="en"> => html(lang="en")
-  <head> =>  head
-    <title>Pug</title> => title= pageTitle
-    <script type="text/javascript"> => script(type='text/javascript').
-      if (foo) bar(1 + 5) =>  if (foo) bar(1 + 5)
-    </script>
-  </head>
-  <body> .  =>   body
-    <h1>Pug - node template engine</h1> => h1 Pug - node template engine
-    <div id="container" class="col"> =>   #container.col
-      <p>You are amazing</p>  =>   p You are amazing
-      <p>Pug is a terse and simple templating language with a strong focus on performance and powerful features.</p> =>  p.
-        Pug is a terse and simple templating language with a
-        strong focus on performance and powerful features.
-    </div>
-  </body>
-</html>
 === Speech to Text ===