FingerSpark Weekly Log

From ESE205 Wiki
Revision as of 00:53, 31 March 2016 by Dbattel (talk | contribs)
Jump to navigation Jump to search

Week of 2/8

  • Chose name for Project
  • Completed Project Proposal
  • Identified essential components of product and incorporated these components into the budget
  • Finalized Gantt Chart
  • Created Wiki Page for Project


Week of 2/15

  • Updated Gantt Chart
  • Researched prices of items on budget: The results are uploaded in the budget section of this project.
  • Discussed the basis of what algorithms we wanted to implement: There may be premade algorithms to find the centroid of color blobs, but if we can't implement one, we could instead test each pixel in the image for whether it falls within our thresholds of color, then take the average of the coordinates of all of these points (which should put us somewhere close to the center of the spot of color).
  • Researched camera specs: We discovered that the PiCamera allows you to take video at 1080p at 30FPS and 720p at 30FPS, as well as still images, and has a native resolution of 5 megapixels. We do not know if this is enough resolution yet, however.
  • We researched common methods of color-finding, and learned that most algorithms that attempt to identify colors in a human-like way use HSV encoding rather than RGB encoding. Accordingly, we now plan to do all of our analysis and thresholding in HSV.


Week of 2/22

  • Updated Project Wiki
  • Updated Project Budget
  • Researched OpenCV: It turns out that OpenCV has many libraries that contain operations we may need, but 1) the documentation is often incomplete or above our level and 2) many functions are implemented tens of times with minor differences in semantics and nuance. We settled on a sublibrary called ImgProc, which seems to contain the blob-detection and color-scanning algorithms we were planning to use.
  • As of last week, we did not know whether the camera had high enough resolution to achieve the tasks we had laid out as critical for this project. This week, we tested a laptop camera with similar specs to the PiCamera to determine at what distance the latter will be able to clearly distinguish different fingers. The result was that at three meters away from the camera, the fingers of a hand are clearly distinguishable to the human eye.


Week of 2/29

  • Continued researched into possible OpenCV sublibraries to use for color point detection (especially further testing with imgproc)
  • Loaded OpenCV onto Raspberry Pi (using Python 2, rather than Python 3 as planned; this took most of our time this week unfortunately)
  • Configured OpenCV to be accessible to Python programs on Raspberry Pi (it took a while to realize that it imports as "cv2", not "opencv")

Week of 3/7

  • This week was midterms so not very much happened on this project, unfortunately. However, we discussed algorithms further and found several ways to skip processing time (for example, checking only every 5th pixel for thresholding, then zooming in on areas with the correct colors to scan further) to avoid having to check millions of points.

Week of 3/14

  • Since we didn't work very much last week due to midterm exams, David took home the Raspberry Pi over spring break and continued to work on getting useful images from the camera. We are able to take video and photos and save them to the desktop or load them into our program; however, the still images have significant blurring even with slow hand motion speeds (not a problem we had anticipated, but one that might be solved by reducing the image's exposure time). The videos are also in the .h264 format, which unfortunately cannot be read by the RaspberryPi or OpenCV. David downloaded a shell script (runnable from python) to convert the movies to .mp4 format, but it still takes too long to be practical at the moment. We have code to send the live video feed from the camera into OpenCV as a "stream" object, but we do not yet understand this code well enough to implement or modify it. That's one of our goals for next week.

Week of 3/21

  • We started implementing the information we'd researched a few weeks ago in our explorations of OpenCV. In particular, at the moment we have somewhat abandoned blob-recognition (too complex and slow for what we need) and instead we are adapting a certain approach to color-finding. A common method of highlighting just a single color range in an image is to generate a "mask" image (1 where a pixel is in the color range, zero otherwise) and use a bitwise-and to overlay that onto the original, leaving only the colored pixels with a non-zero value. However, David realized that we could instead use a bitwise-OR to overlay the masks onto each other, and thus get a black-and-white "smear" image showing everywhere that the colored dot (the glove finger) has been throughout the gesture. This image could be directly compared to a set of templates we can create in the Pi's memory, since image-comparison is a well-explored problem in computer science (and you'd only have to do it once, at the end of the gesture, as opposed to the blob-finding algorithm that would have to run for each frame). Neither of us knows much about how to implement image-shape comparison, however, so we spent time looking up research papers on the subject (which Kjartan also helped us with). We now possess several possible solutions: Mean Squared Error (pixel-by-pixel comparison; most internet resources consider this to be inaccurate but fast), the Structural Similarity Index (developed by these researchers), a machine learning-based approach suggested by Alden (ideal for this purpose, but is significantly beyond our level of knowledge to implement), Keypoint Matching (suggested by Kjartan), and another paper found by Kjartan (whose authors I don't remember off the top of my head). We have not implemented any of these five yet, however, which is our goal for next week. Notably, this approach also somewhat helps with our gesture recognition and scaling difficulties (the other two unsolved problems Prof. Gonzalez gave us in our meeting), since that's incorporated (in theory) within the similarity indices, and we don't have to know the real-life size of the gestures to do this.