BusyBear

From ESE205 Wiki

Project Proposal

Overview

Finding an open table to work at or a short line for food on the WashU campus often seems like an impossible task. BusyBear's goal is to create a database, accessible to WashU students, that shows the population and busyness trends of popular locations on campus, beginning with Bear's Den. Using a network adapter connected to a Raspberry Pi, we will obtain an approximate measurement of busyness based on the number of MAC addresses found in a specific region. By looking at pictures taken at the same time as the MAC address collection, a historical relationship between the number of MAC addresses found and relative busyness can be determined. We hope to store this information in a database hosted by AWS and display the data on a website. Our end goal is to gather information that allows the WashU community to make more informed decisions about where to go and when to go there.

Team Members

Thomas Emerson
Tom Goon
Allison Todd
David Tekien, TA
Jim Feher, Instructor

Links

[Project Log]
[Project Presentation]
[GitHub Repository]
[Network Adapter Monitoring Mode Tutorial]

Objectives

  • Learn and be able to code in Python as related to the Pi
  • Use sniffing/MAC tracking method in the analysis of busyness
  • Investigate the use of the camera in the analysis of busyness
  • Be able to monitor busyness fairly accurately by filtering detected devices
  • Compare busyness at different times of day and between buildings
  • Design a GUI for an aesthetically pleasing and useful website
  • Host a website displaying useful and relevant data through Amazon Web Services (AWS)

Challenges

  • Limited experience with working with WiFi receivers or anything to do with MAC Addresses
  • Limited knowledge of Python and Raspberry Pi
  • Connecting our data with a database, AWS, and a website
  • Privacy Concerns

Gantt Chart

GanttChart 1.png

Budget

Item | Description | Cost | Link
AWS | Website Hosting | $5 / month | https://aws.amazon.com/pricing/?nc2=h_ql_pr
2 x TL-WN722N | Network Adapter | returned: $7.21 | https://www.amazon.com/TP-Link-TL-WN722N-Wireless-network-Adapter/dp/B002SZEOLG
1 x 5dBi Long Range WiFi for Raspberry Pi | Network Adapter | returned: $5.00 | https://www.amazon.com/5dBi-Long-Range-WiFi-Raspberry/dp/B00YI0AIRS/ref=lp_9026119011_1_1?srs=9026119011&ie=UTF8&qid=1550447401&sr=8-1
1 x Alfa AWUSO36NH High Gain USB Wireless G/N Long-Range WiFi Network Adapter | Network Adapter | $31.99 | https://www.amazon.com/Alfa-AWUSO36NH-Wireless-Long-Rang-Network/dp/B0035APGP6/ref=sr_1_1_sspa?keywords=alfa+network+adapter&qid=1553045771&s=gateway&sr=8-1-spons&psc=1
mybusybear.com | Domain Name | $12.00 | DomainPrice.jpg
Total Cost | | $71.20 |

Design and Solutions

Build the Device

We began by constructing a device to collect MAC addresses. Initially, we hoped that the RaspberryPi's built-in WiFi could handle detection on its own. We quickly determined that the RaspberryPi could not enter monitoring mode [1], so we needed external hardware for this purpose. We tried several external network adapters and ultimately found one that both supports monitoring mode and is compatible with the RaspberryPi [2]. The network adapter's use is explored further in the Collect Information section and in the Network Adapter in Monitoring Mode tutorial [3].
We decided that a RaspberryPi camera should be added to the device to strengthen the validity of the data gathered from the network adapter. The Pi Camera is fairly simple to connect and the functionality is implemented through Pi commands[4]. By analyzing a combination of the number of addresses collected and the visual busyness found in the picture, more accurate trends over time can be determined.
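As a rough sketch of how a picture could be captured alongside each scan, the example below builds a capture command with a timestamped file name so that each image can later be matched to the MAC addresses collected in the same interval. It assumes the Pi command in question is raspistill; the function name build_capture_command, the file name pattern, and the output directory are hypothetical choices made for illustration.

```python
import shutil
import subprocess
from datetime import datetime

def build_capture_command(when, directory="."):
    """Return a raspistill command that saves a timestamped JPEG.

    The file name pattern and directory are assumptions for this sketch.
    """
    filename = when.strftime("busybear_%Y%m%d_%H%M%S.jpg")
    return ["raspistill", "-o", f"{directory}/{filename}"]

# Fixed timestamp so the resulting command is easy to inspect
command = build_capture_command(datetime(2019, 4, 1, 12, 30, 0))
print(" ".join(command))

# Only take a picture when raspistill is actually installed (i.e. on the Pi)
if shutil.which("raspistill"):
    subprocess.run(command, check=True)
```

On a machine without the camera tools the script only prints the command, which makes it safe to try before deploying to the Pi.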

Collect Information

In the previously mentioned tutorial [Network Adapter Monitoring Mode Tutorial], we explained how to set up the network adapter in monitoring mode and install kismet, the software we used to take advantage of monitoring mode. Once properly configured, simply calling kismet prints text to the console like so:

pi@raspberrypi:~ $ kismet
INFO: Including sub-config file: /usr/local/etc/kismet_httpd.conf
INFO: Including sub-config file: /usr/local/etc/kismet_memory.conf
INFO: Including sub-config file: /usr/local/etc/kismet_alerts.conf
INFO: Including sub-config file: /usr/local/etc/kismet_80211.conf
INFO: Including sub-config file: /usr/local/etc/kismet_storage.conf
INFO: Including sub-config file: /usr/local/etc/kismet_logging.conf
INFO: Including sub-config file: /usr/local/etc/kismet_uav.conf
INFO: Loading config override file '/usr/local/etc/kismet_site.conf'
INFO: Optional sub-config file not present: /usr/local/etc/kismet_site.conf
KISMET - Point your browser to http://localhost:2501 for the Kismet UI
      control.
INFO: Setting default channel hop rate to 1/sec
INFO: Enabling channel list splitting on sources which share the same list
      of channels
INFO: Enabling channel list shuffling to optimize overlaps
INFO: Sources will be re-opened if they encounter an error
INFO: Saving datasources to the Kismet database log every 30 seconds.
INFO: Launching remote capture server on 127.0.0.1:3501
ALERT: LOGDISABLED Logging has been disabled via the Kismet config files
       or the command line.  Pcap, database, and related logs will not be
       saved.
INFO: Probing interface 'mon1' to find datasource type
INFO: Logging disabled, not enabling any log drivers.
INFO: Starting Kismet web server...
INFO: Started http server on port 2501
INFO: Found type 'linuxwifi' for 'mon1'
INFO: Interface 'mon1' is already in monitor mode
INFO: System-wide wireless regulatory domain is set to '00'; this can
      cause problems setting channels.  If you encounter problems, set the
      regdom with a command like 'sudo iw reg set US' or whatever country
      is appropriate for your location.
INFO: Detected new 802.11 Wi-Fi access point 28:AC:9E:80:86:E1
INFO: Detected new 802.11 Wi-Fi access point 00:A7:42:FC:6E:01
INFO: 802.11 Wi-Fi device 00:78:88:30:4E:C3 advertising SSID
      'wustl-guest-2.0'
INFO: 802.11 Wi-Fi device 28:AC:9E:80:86:E1 advertising SSID
      'wustl-guest-2.0'
INFO: Detected new 802.11 Wi-Fi access point 28:AC:9E:80:86:E3
INFO: 802.11 Wi-Fi device 00:A7:42:FC:6E:01 advertising SSID 'WUSM-secure'
INFO: Detected new 802.11 Wi-Fi access point 00:A7:42:FC:6E:03
INFO: 802.11 Wi-Fi device 28:AC:9E:80:86:E3 advertising SSID 'eduroam'
INFO: Detected new 802.11 Wi-Fi access point 00:A7:42:FC:6E:05
INFO: 802.11 Wi-Fi device 00:A7:42:FC:6E:03 advertising SSID
      'wustl-guest-2.0'
INFO: Detected new 802.11 Wi-Fi access point 00:A7:42:FC:6E:00
INFO: 802.11 Wi-Fi device 00:A7:42:FC:6E:00 advertising SSID 'eduroam'
INFO: Detected new 802.11 Wi-Fi device 74:B5:87:C6:90:1E
INFO: Detected new 802.11 Wi-Fi device 00:08:E3:FF:FD:EC
INFO: Detected new 802.11 Wi-Fi device 02:A7:42:FC:6E:00
INFO: Detected new 802.11 Wi-Fi device 2A:AC:9E:80:86:E0
INFO: Detected new 802.11 Wi-Fi device 8C:45:00:04:DA:8F

We could collect data, but we needed a way to consolidate the MAC addresses, find out what device each one belonged to, and upload that information to the database at regular intervals. Over a couple of months, we finalized our data collection design around crontab [5], a program used to schedule the execution of programs at certain times. Crontab was also used to email the Pi's IP address on boot, as detailed in a class tutorial [6]. For data collection, crontab schedules two tasks: running kismet and dumping the output into a text file, and running a script that parses the text file and uploads the necessary information.

The crontab usage can be seen below. Both entries fire every five minutes. The first entry runs kismet through the timeout command [7] so that it only runs for four minutes (240 seconds). All of the output is written to kismetlog.txt.

# m h  dom mon dow   command
*/5 *  *   *   *    /usr/bin/timeout 240 /usr/local/bin/kismet > kismetlog.txt
*/5 *  *   *   *    ./busybear2

The second task to run is an executable file named "busybear2" [8], whose contents are shown below.

 
sleep 242s
python3 uploader.py

This bash script's only purpose is to wait 4 minutes and 2 seconds (essentially waiting for the kismet task to terminate) before executing the uploader script. The contents of "uploader.py" can be seen below.

import json
import re

import requests

# mysql connector allows us to connect to a mysql database
import mysql.connector

# load our config json file into a python dictionary
with open('config.json') as configFile:
  config = json.load(configFile)

# establish mysql connection
db = mysql.connector.connect(
  host=config['mysqlHost'],
  user=config['mysqlUser'],
  passwd=config['mysqlPassword'],
  database=config['mysqlDatabase']
)

# the cursor is used to query the database
cursor = db.cursor()

# Parameterized INSERT; the connector escapes the values passed to execute()
qString = 'INSERT INTO wifiMAC (macAdd, vendor) VALUES (%s, %s)'

print('Connection established')

# Regular expression, only gets MAC addresses that directly follow the word "device"
MAC_regex = re.compile(r"(?<=\bdevice\b\s)\w\w[:]\w\w[:]\w\w[:]\w\w[:]\w\w[:]\w\w")

# URL for MAC Address lookup API
MAC_URL = 'http://macvendors.co/api/%s'

# Loop through the lines of the raw kismet output to find MAC addresses
with open("kismetlog.txt", "r") as testFile:
  for line in testFile:
    MAC_addresses = MAC_regex.findall(line) # All MAC addresses found on this line
    for address in MAC_addresses: # Loop through the individual MAC addresses
      req = requests.get(MAC_URL % address)
      obj = req.json()
      for key, value in obj.items():
        # Use the vendor name when the API returns one, otherwise store 'Null'
        if 'company' in value:
          values = (address, value['company'])
        else:
          values = (address, 'Null')
        cursor.execute(qString, values)

db.commit()
cursor.close()
db.close()

print('Database updated')

The uploader script uses regular expressions [9] [10] [11] to isolate the desired MAC addresses from the text file produced by the first crontab task. It then uses a MAC Address Vendor Lookup API [12] to attach a vendor name to each MAC address. Finally, the MAC address and its associated vendor are uploaded to our MySQL database [13], where each row is automatically assigned a timestamp and a unique ID. The next section describes how our database was created and structured. NOTE: config.json is used but not shown for privacy, because it holds the login credentials for our database.
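To illustrate the filtering step, the sketch below applies the same regular expression to a few lines copied from the kismet output shown earlier (one line is repeated on purpose). The helper name unique_device_macs and the use of a set to drop repeated addresses are assumptions added for illustration; they are not part of uploader.py.

```python
import re

# Same pattern as uploader.py: a MAC address directly preceded by "device "
MAC_REGEX = re.compile(r"(?<=\bdevice\b\s)\w\w[:]\w\w[:]\w\w[:]\w\w[:]\w\w[:]\w\w")

# Lines taken from the kismet console output; the third line is repeated on purpose
SAMPLE_LOG = """\
INFO: Detected new 802.11 Wi-Fi access point 28:AC:9E:80:86:E1
INFO: 802.11 Wi-Fi device 28:AC:9E:80:86:E3 advertising SSID 'eduroam'
INFO: Detected new 802.11 Wi-Fi device 74:B5:87:C6:90:1E
INFO: Detected new 802.11 Wi-Fi device 74:B5:87:C6:90:1E
INFO: Detected new 802.11 Wi-Fi device 00:08:E3:FF:FD:EC
"""

def unique_device_macs(log_text):
    """Return the set of MAC addresses that follow the word 'device' (hypothetical helper)."""
    found = set()
    for line in log_text.splitlines():
        found.update(MAC_REGEX.findall(line))
    return found

macs = unique_device_macs(SAMPLE_LOG)
print(sorted(macs))
```

The "access point" line is skipped because its address is not preceded by the word "device", while the "advertising SSID" line does match. Access points that advertise a network would therefore be counted alongside client devices unless they are filtered out separately.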

Managing a Database

We created an RDS database through AWS named BusyBear, which we can access through MySQL Workbench. In the database, we created multiple tables to store current MAC addresses and historical information. It was important to us that the table storing MAC addresses automatically populate timestamps with the current time. For connecting the database externally to both the Pi and the website, we found a combination of Python and PHP to be the most effective. To keep the database secure, we had to escape all incoming data to prevent SQL injection attacks. As of now, our database isn't in third normal form; reaching it is a goal so that we can store and access data more efficiently. One hurdle was attempting to store pictures in the database (as LONGBLOB). We discovered that a) the complexity caused instability, and b) the pictures' large size made queries take an exceptionally long time, so it was best to upload images directly to the website and bypass the database.
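The two database behaviours described above, automatic timestamps and escaping of incoming data, can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for the MySQL database so that it runs anywhere. The names wifiMAC, macAdd, and vendor come from uploader.py; the id and seenAt column names, the helper insert_mac, and the sample vendor strings are assumptions.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the project's MySQL database
db = sqlite3.connect(":memory:")

# id and seenAt are assumed column names; both are filled in by the database
db.execute("""
    CREATE TABLE wifiMAC (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        macAdd TEXT NOT NULL,
        vendor TEXT,
        seenAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")

def insert_mac(conn, mac, vendor):
    """Insert one row using placeholders so the driver escapes the values."""
    conn.execute("INSERT INTO wifiMAC (macAdd, vendor) VALUES (?, ?)", (mac, vendor))
    conn.commit()

insert_mac(db, "74:B5:87:C6:90:1E", "Example Vendor")
# A hostile value is stored as plain text instead of being executed as SQL
insert_mac(db, "00:08:E3:FF:FD:EC", "x'); DROP TABLE wifiMAC; --")

rows = db.execute("SELECT id, macAdd, vendor, seenAt FROM wifiMAC ORDER BY id").fetchall()
for row in rows:
    print(row)
```

sqlite3 uses ? as its placeholder where MySQL Connector/Python uses %s, but the principle is the same: values are passed separately from the SQL text rather than being concatenated into it.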

Create a Website

We initially imagined that our website would flow like this:
Website Design Flowchart
At the moment, our plan is to create a website very similar to this, but with direct access to the available location from the home page. Our current website should function as follows:
Website Design Flowchart

The website is hosted by AWS on a Lightsail instance running a LAMP stack (which comes preinstalled with apache2, PHP, and MySQL support), which made the setup significantly easier. Each page of the website is a PHP, CSS, HTML, and JavaScript hybrid: PHP reads data from and writes data to the database, while HTML, CSS, and JavaScript display this information. Specifically, we use JavaScript functions created by Google as well as code from https://canvasjs.com/html5-javascript-bar-chart/ to display the relevant graphs. To make the website more professional, we opted to buy (rent) a domain name from Amazon Route 53. This required us to rework the DNS settings of the domain, but we ultimately succeeded in connecting it to our Lightsail instance. Our final website can be found here: http://mybusybear.com

Put It All Together

While the RaspberryPi gives us data and the website gives us a way to communicate it, we next need to combine these pieces in an understandable and usable manner. To do so, we need to understand how the number of MAC addresses found relates to the actual level of busyness in a real-world environment. Because we have yet to collect data over time and relate this number to the people in a specific area, we cannot yet describe our complete thought process. At the moment, we expect a linear relationship between the number of MAC addresses and the relative busyness: as the number of MAC addresses increases, the level of busyness increases proportionally. We are currently attempting to determine the rate at which this change occurs by comparing the number of MAC addresses found with the number of people visible in the image captured at the same time. By looking at this relationship over time, a general understanding can be reached for our specific location on campus, Bear's Den.
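If the relationship does turn out to be roughly linear, the rate could be estimated with an ordinary least-squares fit of the people counted in each picture against the MAC addresses found in the same interval. The sketch below shows the calculation. The sample numbers are invented purely for illustration and are not measurements from Bear's Den, and the function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented sample data: MAC addresses found vs. people counted in the picture
mac_counts = [20, 40, 60, 80]
people_counts = [12, 22, 32, 42]

slope, intercept = fit_line(mac_counts, people_counts)
estimate = slope * 50 + intercept  # predicted head count when 50 MACs are seen
print(slope, intercept, estimate)
```

With real data the points will not fall exactly on a line, so plotting the residuals would be a sensible check on whether the linear assumption holds before the fit is used on the website.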

Results

Next Steps

There are several areas we can improve upon or explore.

  1. Increasing the capacity in which the camera is used: Currently, the pictures captured by the Pi are strictly for reference. The system would be more robust if some form of human detection were implemented and the pictures were automatically analyzed to refine our busyness level.
  2. Privacy and security concerns: Nothing strictly illegal was done; however, that does not mean there are no concerns. While sniffing packets is done regularly by numerous devices and apps, tracking and monitoring people still raises ethical considerations. Additionally, the pictures being taken and stored could potentially capture damaging information. All of this, paired with the fact that we did not encrypt any information, makes this an area we would strengthen.
  3. Expanding the range and operation: With only one Pi, which is a rather bulky physical object, it is hard to find a consistent place to operate. Addressing this and expanding where we operate would certainly broaden the scope of the project.

References

Not Quoted

Past Projects

Pi Blinking LED (tutorial sake)

nmap (unused in the end)

fping (unused in the end)

openCV (unused in the end)

kismet & monitoring mode (referenced in our tutorial)

Regex/Dictionary/API

Quoted

  1. Pi Network - [1]
  2. Network Adapter - [2]
  3. Network Adapter in Monitoring Mode Tutorial - [3]
  4. Pi Camera - [4]
  5. How to use Crontab - [5]
  6. SSHing into your Pi Tutorial - [6]
  7. Using timeout with crontab - [7]
  8. How to make a file executable - [8]
  9. CSE 330 Wiki: Regular Expressions - [9]
  10. Regular expressions look-ahead/behind - [10]
  11. Online Regular Expression tester - [11]
  12. MAC Address Vendor Lookup API - [12]
  13. Uploading data to a mySQL database - [13]