=Eye Tracking for ALS Patients=

===Background===
This research was conducted by Sana Naghipour and Saba Naghipour in the Fall 2011 semester at Washington University in St. Louis. It was part of the Undergraduate Research Program and taken for credit as the course ESE 497 in the Electrical and Systems Engineering Department. The project was overseen by Dr. Arye Nehorai, Ed Richter, and Phani Chavali.

===Project overview===
The Eye Tracker project is a research effort to empower people suffering from Amyotrophic Lateral Sclerosis (ALS) to write using their eyes by tracking the movement of the pupil. The project will be implemented in two main phases:

First Phase: development of the software for pupil tracking.

Second Phase: building the hardware necessary to capture images of the eye and transfer them to a processing unit.

===Acknowledgment===
We would like to express our greatest appreciation to Prof. Nehorai and Ed Richter for providing us with this research opportunity and for their feedback throughout the project.
  
We are highly indebted to our mentor, Phani Chavali, for his guidance and constant supervision, as well as for providing necessary information regarding the project.
  
===Introduction===
 
  
First Phase: In this phase of the project, we focus on software development. We use an infrared camera to capture images of a human eye using LabVIEW. We employ template matching methods to locate the center of the pupil, using a small patch of dark pixels as the template. We propose adaptive search methods, in which we choose the search space in each frame based on estimates of the pupil location from previous frames. The proposed adaptive search greatly reduces the computational complexity of the algorithm, which is essential for real-time tracking. We will also work on hybrid methods that combine template matching with feature selection to develop robust, computationally inexpensive algorithms that are insensitive to noise in the images and to the orientation of the camera used to obtain them.

Second Phase: In the second phase of the project, we will build the hardware system that records images using a camera mounted on a pair of sunglasses. We will port the software developed during the first phase onto a suitable microcontroller, which performs the processing and generates control signals that can be used, for example, to move prosthetic limbs. We will also develop hardware to transmit the recorded images to the microcontroller wirelessly.

The eye-tracking project would help patients with various tasks such as communication, writing emails, drawing, and making music. More advanced applications include cognitive studies, laser refractive surgery, computer usability, translation process research, infant research, sports training, and commercial eye tracking.
  
===System Setup===
  
We use LabVIEW to capture video from an infrared camera; it supports recording at several frame rates and in several formats. After obtaining the video, we perform sequential frame-by-frame processing.
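
Capture and recording are done in LabVIEW; purely as an illustration, here is a minimal Python sketch of such a sequential frame-by-frame loop (the file name is a placeholder, and the OpenCV package is assumed for decoding the video):

<pre>
import cv2  # assumes the opencv-python package is installed

cap = cv2.VideoCapture("eye_recording.avi")  # placeholder file name
n_frames = 0
while True:
    ok, frame = cap.read()   # frame: H x W x 3 BGR uint8 array
    if not ok:               # False once the video is exhausted
        break
    n_frames += 1            # per-frame pupil tracking goes here
cap.release()
print(n_frames, "frames processed")
</pre>
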
[[File:SETUP1.jpg|200px|Experiment Setup]]
[[File:SETUP2.jpg|500px|Experiment Setup]]
  
===Tracking===

In this part we used several methods that helped us overcome the challenges we faced during the project. The techniques we used are described below.
  
'''Discarding color information:'''
We convert the image from each frame into its corresponding grayscale image. To do this, we average the pixel values across the three color channels.
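
As a minimal sketch (one way to write our channel averaging, not a fixed implementation), in Python/NumPy, assuming each frame is an H x W x 3 array:

<pre>
import numpy as np

def to_gray(frame):
    """Average the three color channels to obtain a grayscale image."""
    return frame.astype(np.float32).mean(axis=2)

# Example on a synthetic 480 x 640 color frame:
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
gray = to_gray(frame)  # shape (480, 640)
</pre>
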
'''Low-pass filtering:'''
We use low-pass filtering to remove sharp edges in each image. This also helps remove undesired background light from the image.
[[File:LPF1.jpg|200px|Low Pass Filtering]]
[[File:LPF2.jpg|200px|Low Pass Filtering]]
[[File:lpf.jpg|200px|Low Pass Filtering]]
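
A minimal sketch of this step, using a simple box (moving-average) filter as the low-pass filter; the 5 x 5 kernel size here is an illustrative assumption, not a recorded project parameter:

<pre>
import numpy as np
from scipy.ndimage import uniform_filter

def low_pass(gray, size=5):
    """Box filter: each pixel becomes the mean of its size x size
    neighborhood, smoothing sharp edges and background glints."""
    return uniform_filter(gray.astype(np.float32), size=size)
</pre>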
  
'''Scaling:'''
We scale down the filtered images to obtain lower-resolution images. This serves two purposes. First, since the dimensions of the image decrease, scaling improves processing time. Second, the averaging effect removes undesired background light.
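
A sketch of the downscaling by block averaging (the scale factor of 4 is illustrative only):

<pre>
import numpy as np

def downscale(gray, factor=4):
    """Replace each factor x factor block by its mean, shrinking the
    image while averaging away background light."""
    h, w = gray.shape
    h, w = h - h % factor, w - w % factor  # crop to whole blocks
    blocks = gray[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))
</pre>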
  
'''Template matching:'''
We used a template matching algorithm to segment the darkest region of the image. After discarding the color information and low-pass filtering, the pupil corresponds to the darkest spot in the image, so we used a small patch of dark pixels as the template. The matching is done by exhaustive search over the entire image; once a match is found, the centroid of the matched block is taken as the pupil location. For the experiments, we used a block size of 5 x 5 pixels.
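
A minimal sketch of the exhaustive matching: against an all-dark (all-zero) template, the sum-of-squared-differences score reduces to a windowed sum of squared intensities, so the best match is simply the darkest block:

<pre>
import numpy as np
from scipy.ndimage import uniform_filter

def match_dark_template(gray, block=5):
    """Return the (row, col) center of the block x block window that
    best matches an all-dark template."""
    # SSD against a zero template is the windowed sum of squared
    # intensities; uniform_filter computes the windowed mean, which
    # has the same minimizer.
    ssd = uniform_filter(gray.astype(np.float32) ** 2, size=block)
    r, c = np.unravel_index(np.argmin(ssd), ssd.shape)
    return r, c
</pre>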
  
'''Determining the search space:'''
Since exhaustive search over the entire image is computationally intensive, we propose an adaptive search method: we choose the search space based on the pupil location in the previous frame. Using this past information, we greatly reduce the complexity of the search. We used a search space of 75 x 75 pixels around the pupil location from the last frame.
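
A sketch of the adaptive search, reusing match_dark_template from the sketch above; it assumes the image is at least 75 x 75 pixels and clamps the window to the image borders:

<pre>
def adaptive_search(gray, prev_rc, win=75, block=5):
    """Search only a win x win window centered on the previous pupil
    location, then map the match back to full-image coordinates."""
    r0 = min(max(prev_rc[0] - win // 2, 0), gray.shape[0] - win)
    c0 = min(max(prev_rc[1] - win // 2, 0), gray.shape[1] - win)
    patch = gray[r0:r0 + win, c0:c0 + win]
    r, c = match_dark_template(patch, block=block)
    return r0 + r, c0 + c
</pre>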
  
  
 
===Results===

[[File:RESULT4.jpg|200px|RESULT]]
[[File:RESULT1.jpg|200px|RESULT]]
[[File:RESULT3.jpg|200px|RESULT]]
[[File:RESULT2.jpg|200px|RESULT]]

===Conclusion & Future work===
An algorithm for estimating the position and movement of the pupil was implemented using a template matching method. In the future, we will build the necessary hardware that uses the algorithm for prosthetic limb control.

====Source Files====
[[File:Code.zip|Final Code]]
  
[[File:powerpoint.ppt|Presentation]]
[[File:poster.ppt|poster]]
 
