Web Application Security, Part 2

From CSE330 Wiki

This is Part 2 of the Web Application Security article, geared toward the material covered in Module 3. For material covered in Module 2 (HTML, CSS, and PHP), see Web Application Security, Part 1. For material covered in Module 6 (JavaScript), see Web Application Security, Part 3.

Password Security

Let's assume for a moment that despite all of your efforts in the other fronts of web security, an attacker was still able to extract information from your database. If you store your passwords as plain text, not only will the attacker be able to log in as whomever he chooses, but the attacker will also likely be able to log in as the users of your site on different sites (since many users employ the same password on several different web sites).

Hashing

In cryptography, a hash function is a function which maps an input to an output in a way that is (hopefully) impossible to undo. In other words, if you have the hashed version of a password and you want to figure out what password could've been used to create it, your only option is to try every possible password until one matches. With a secure hash function and strong password, that process will take many millions of years.

In web development, you should always store the hashed version of passwords. This means that you feed a string of text (a password) to a hash function, the hash function returns another string of text called a digest or hash of the password, and that digest is what you store in your database. The reason for doing this is that, if someone evil ever gains access to your database, they won't be able to easily figure out what passwords your users chose for your website. People tend to reuse passwords, so a breach at your site could easily become a breach at other sites, too.

An important thing to note is that cryptographic hashes always have constant length, meaning that, in your database, you should be using a field of a fixed length to store them if you're using a fixed hashing algorithm. Because you're only storing the hashed version of a password, and not the original, it doesn't matter to you how big the original password was; you don't have to store it! As a result, if you ever see a website that limits your password to, say, 20 characters, that's a good sign that they're insecurely storing your password in plaintext.
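As a small illustration of the constant-length property, the sketch below hashes a short input and a very long input with PHP's built-in hash() function. SHA-256 is used here only as a convenient example of a cryptographic hash; it is an assumption of this sketch, not the algorithm recommended for password storage later in this article.

```php
<?php
// Illustration only: SHA-256 stands in for "a cryptographic hash function".
$short = hash('sha256', 'abc');
$long  = hash('sha256', str_repeat('a very long password ', 1000));

// Both digests have the same length, regardless of the input length.
echo strlen($short), "\n";
echo strlen($long), "\n";

// The same input always produces the same digest.
var_dump($short === hash('sha256', 'abc'));
```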


Now, hashing is very important, but, by itself, it can be vulnerable to precomputation attacks, because a given string will always hash to the same value. In other words, an attacker could assemble a list of common passwords and their hashes, and simply look for those hashes in your database, saving them the work of trying to break each hash individually. To defend against such attacks, hashing algorithms can be used with so-called salts to increase their security. What this means is that the string to be hashed is modified using a randomly generated chunk of data called a salt before the hashing takes place. The salt is stored with the hash in the database, and, when a user enters their password, it is combined with the salt before the hash function is applied. The resulting hash defeats precomputation attacks, because it means that a given password no longer hashes to the same value across multiple databases, but as long as you're hashing the same string with the same salt, you'll always get the same results.
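To make the effect of a salt concrete, here is a minimal sketch. The helper salted_digest() is hypothetical and exists only for this illustration, and SHA-256 is again just a stand-in; real applications should not hand-roll salting like this.

```php
<?php
// Illustration only: shows why a salt defeats precomputed tables.
// salted_digest() is a hypothetical helper, not a recommended API.
function salted_digest(string $password, string $salt): string {
    return hash('sha256', $salt . $password);
}

$password = 'hunter2';
$salt_a = bin2hex(random_bytes(16)); // random_bytes() requires PHP 7 or newer
$salt_b = bin2hex(random_bytes(16));

// Same password, different salts: the stored digests differ,
// so one precomputed table cannot cover both.
$digest_a = salted_digest($password, $salt_a);
$digest_b = salted_digest($password, $salt_b);
var_dump($digest_a === $digest_b);

// Same password, same salt: the digest is reproducible,
// which is what makes checking a login possible.
var_dump($digest_a === salted_digest($password, $salt_a));
```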

xkcd 221

Solution

So, the solution is to store salted, hashed passwords in your database. PHP provides the password_hash() function to do this for you.

password_hash is only available in PHP 5.5 or newer.

Note that, as part of generating a more secure hash, password_hash will output a longer hash than older functions do, so make sure your database field is large enough to hold the whole hash. Bcrypt hashes, such as the one in the example below, are 60 characters long, so a field of at least 60 characters will do the trick.

How does it work? password_hash is an automagical function that generates a random salt and hashes your password with it. It takes two required arguments: the string to hash and a constant representing the hashing algorithm to use. Let's look at an example.

<?php

echo password_hash("Hello World", PASSWORD_BCRYPT);
/*
 * The above line prints, for example: $2y$10$hfd4bRj2w2shMdxVZc7MEu3uZoXU7CNUEmUhWCEJB11xO8vTc8ply
 * This line is of the format:   $2y$cost$salt-and-hash (a 22-character random salt immediately followed by the 31-character hash)
 * Note that the actual hash generated will be different if you run this, because of a different random salt.
 */
?>

As you can see from the above example, the function generates a single string that contains both the random salt and the hashed password.

Password Storage

You should store the constant-length value in a CHAR field in your database, and then have just one field for both the password and the salt.

A note about password_hash: If you always want to use the latest recommended hashing algorithm, you can specify PASSWORD_DEFAULT as your hashing algorithm. As of PHP 7, PASSWORD_DEFAULT is equivalent to PASSWORD_BCRYPT, but that may change in future versions. Because a future hashing algorithm may produce larger hashes than bcrypt does, the maintainers of PHP recommend a field that can hold hashes of up to 255 characters. If, and only if, you choose to use PASSWORD_DEFAULT, you may use a variable-length field that can hold 255 characters to store your hashes; in MySQL, for example, a field of type VARCHAR(255) would be acceptable. However, because there will not be any changes to the hashing algorithms during the timeframe of this class, you may still use a fixed-length field.
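The following sketch shows PASSWORD_DEFAULT in use and checks that the result fits in a 255-character field. The sample password is made up for the illustration, and the algorithm name printed at the end depends on your PHP version.

```php
<?php
// PASSWORD_DEFAULT lets PHP pick its currently recommended algorithm.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// Whatever the algorithm, the result is expected to fit in a VARCHAR(255).
echo strlen($hash), "\n";

// password_get_info() reports which algorithm produced a given hash.
$info = password_get_info($hash);
echo $info['algoName'], "\n";
```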

Checking Passwords

To check a password, one might simply compare the hash of the password a user inputs against the hash stored in your database using normal string comparison (i.e. the == operator). If they're the same, you're all set. Simple enough, right? Wrong. The fact is, security is really, really hard, and there are lots of things you have to be aware of in a security context that you don't normally need to worry about.

In this case, the == operator exposes you to what's called a timing attack. The string comparison algorithm used by == returns false as soon as a character is different in the two strings, meaning that, if the two strings have a different first character, it will return earlier than if they have a different last character. Normally, this is a good thing; it makes your code run faster. In this context, though, it's a bad thing, because an attacker can time how long the function takes to run, and, using that information, guess the characters of the string.

As an extra bit of fun, the == operator exhibits some behavior that's less than intuitive.

Try to guess what the output would be if you were to run the following code:

$a = "0e153958235710973524115407854157";
$b = "0e015339760548602306096794382326";
if ($a == $b) {
    echo "Passwords Match!";
} else {
    echo "Passwords Don't Match!";
}

Believe it or not, that snippet of code will output "Passwords Match!". The reason for this is that the == operator tries to convert strings to other types before testing their equality. In this case, the two strings both consist of "0e" followed only by digits, which PHP treats as a floating-point number written in scientific notation (e.g. 2e10 == 2 * pow(10, 10)). Because 0 multiplied by any power of 10 is still 0, PHP interprets each string as the number 0. That's right: both $a and $b evaluate to 0, so the comparison is true.

In a security situation, that's a terrible thing: any two hashes that consist of 0e followed only by digits will always be reported as equal to one another.

Luckily, PHP, starting in version 5.5, has your back, with a function that solves both the problems above. You can use the password_verify() function to check a password securely.
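As a quick round trip, the sketch below hashes a made-up password and then checks a correct and an incorrect guess against it. It also runs the two "0e" strings from the snippet above through hash_equals(), which compares strings byte by byte in constant time and performs no numeric conversion.

```php
<?php
// At registration time: hash the password (a random salt is generated automatically).
$stored_hash = password_hash('s3cret-passphrase', PASSWORD_DEFAULT);

// At login time: check the submitted password against the stored hash.
$good = password_verify('s3cret-passphrase', $stored_hash);
$bad  = password_verify('wrong-guess', $stored_hash);
var_dump($good); // bool(true)
var_dump($bad);  // bool(false)

// For other secret strings (tokens, for example), hash_equals() avoids
// both the timing problem and the type-juggling problem of ==.
$a = "0e153958235710973524115407854157";
$b = "0e015339760548602306096794382326";
$same = hash_equals($a, $b);
var_dump($same); // bool(false)
```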

Example

Here is an example of what you might do on your login page. Note: This example assumes that you hashed the user's password already and stored it in the database.

<?php
// This is a *good* example of how you can implement password-based user authentication in your web application.

require 'database.php';

// Use a prepared statement
$stmt = $mysqli->prepare("SELECT COUNT(*), id, hashed_password FROM users WHERE username=?");

// Bind the parameter
$user = $_POST['username'];
$stmt->bind_param('s', $user);
$stmt->execute();

// Bind the results
$stmt->bind_result($cnt, $user_id, $pwd_hash);
$stmt->fetch();

$pwd_guess = $_POST['password'];
// Compare the submitted password to the actual password hash

if($cnt == 1 && password_verify($pwd_guess, $pwd_hash)){
	// Login succeeded!
	$_SESSION['user_id'] = $user_id;
	// Redirect to your target page
} else{
	// Login failed; redirect back to the login screen
}
?>

Note: You may sometimes see functions like vanilla md5() or crypt() used to hash passwords. md5() does indeed perform one-way hashing, but it does so without a salt. THIS IS BAD PRACTICE, because unsalted md5 hashes of common passwords can be trivially looked up in a rainbow table. (Just Google for "md5 decrypter".) Using a salt prevents the effective use of a rainbow table. Even with a salt, however, the md5 function should not be used for passwords: it is so fast to compute that an attacker can try an enormous number of guesses per second, and it has known cryptographic weaknesses.

Real-Life Examples

Here is a constantly-updated list of sites that do not use proper password security: http://plaintextoffenders.com/

Cross-Site Request Forgery


A cross-site request forgery (CSRF, pronounced sea-surf) involves a victim, who is logged in to the targeted site, visiting an attacker’s site. The attacker has code on his site that forces the victim to unwittingly perform actions on the targeted site.

For example, suppose Mother Goose visited Dr. Evil's blog, and Dr. Evil had the following tag embedded in his blog:

<img src="http://www.bank.com/transfer.php?dest=dr-evil&amount=5000" />

This would cause Mother Goose to authorize a $5000 transfer to Dr. Evil, completely without Mother Goose's knowledge!

Worse yet, Dr. Evil could just send an e-mail to Mother Goose with this image tag. All Mother Goose would need to do to be attacked is open the e-mail! (Now you know why sometimes your e-mail client turns off images from suspicious sources.)

Solution

The first precautionary measure is to always use POST requests (as opposed to GET requests) for actions that change something on your server. This defeats simple attacks like the <img> tag above, but not all of them: a page on the attacker's site can still submit a hidden form to your server via POST.

However, fully preventing CSRF attacks is not difficult. To do this, you can use a CSRF token. A CSRF token is a secret, randomly generated string of text, known only to your server and the user's browser session, that is submitted with all of the forms on your site. If the submitted string is not what you expect, then you can assume that the request was forged.

For example, consider this form:

<form action="transfer.php">
    <input type="text" name="dest" />
    <input type="number" name="amount" />
    <input type="submit" value="Transfer" />
</form>

We can easily add a hidden CSRF token field like so (as well as making the form POST rather than GET):

<form action="transfer.php" method="post">
    <input type="text" name="dest" />
    <input type="number" name="amount" />
    <input type="hidden" name="token" value="<?php echo $_SESSION['token'];?>" />
    <input type="submit" value="Transfer" />
</form>

This assumes that $_SESSION['token'] contains an alphanumeric string that was randomly generated upon session creation. For example, you could add this line beneath where the user successfully authenticates:

$_SESSION['token'] = bin2hex(openssl_random_pseudo_bytes(32)); // 32 random bytes, hex-encoded into a 64-character string
// In PHP 7 or newer, you can use the following, better technique:
// $_SESSION['token'] = bin2hex(random_bytes(32));

We can now test for validity of the CSRF token on the server side (in transfer.php):

<?php
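// Validate the CSRF token before touching any other input.
// A missing or non-string token is treated as a forgery too.
if(!isset($_SESSION['token'], $_POST['token']) || !is_string($_POST['token'])
	|| !hash_equals($_SESSION['token'], $_POST['token'])){
	die("Request forgery detected");
}
$destination_username = $_POST['dest'];
$amount = $_POST['amount'];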
$mysqli->query(/* perform transfer */);
?>

Now, if Mother Goose were to view a page containing the malicious <img> tag, the transfer would not take place.

Real-Life Examples

SQL Injection

xkcd 327 (Exploits of a Mom)

SQL injection occurs when an attacker submits specially crafted input to your server, which is then included in an SQL query. The input modifies the query to perform additional actions on the database or to reveal information the attacker should not be able to access.

For instance, suppose you had the following code:

<?php
require 'database.php';

/* DISCLAIMER: THIS CODE IS BAD IN MANY MORE WAYS THAN JUST
BEING VULNERABLE TO SQL INJECTION! IT IS FOR DEMONSTRATION OF
CONCEPT ONLY. DO NOT USE THIS CODE IN YOUR OWN PROJECTS! */

$res = $mysqli->query("SELECT id FROM users WHERE username='".$_POST['username']."' AND password='".$_POST['password']."'");

if( $res->num_rows==1 ){
    $row = $res->fetch_assoc();
    $_SESSION['user_id'] = $row["id"];
}else{
    echo "Login failed.";
    exit;
}
?>

This code is vulnerable to SQL injection. For example, suppose the attacker used the following string of text for his username:

mother-goose' -- 

(Note the space after the two dashes.) Here's what the resulting query would look like:

SELECT id FROM users WHERE username='mother-goose' -- ' AND password=''

Since -- followed by a space is the start of a comment in MySQL, when MySQL interprets this query, it will completely ignore the password-checking part of the query! Dr. Evil can log in using anyone's username and steal all of their money!

Solution

If you write your queries manually (as in the example above), you need to use $mysqli->real_escape_string() to escape every piece of user input before placing it inside quotes in the query:

<?php
$safe_username = $mysqli->real_escape_string($_POST['username']);
// ...
?>

However, the better solution is to use prepared queries. For more information on prepared queries, see PHP and MySQL.

IMPORTANT: If you correctly use the sample code in the PHP and MySQL guide above, you are already safe from SQL injection attacks.

Real-Life Examples

Abuse of Functionality

Abuse of Functionality is a general term that refers to when an attacker exploits vulnerabilities in the logic of your application.

For example, suppose you were a banking site, and you had the following code to perform a transaction:

<?php
require 'database.php';

$amount = (double) $_POST['amount'];
$destination_username = $mysqli->real_escape_string($_POST['destination']);

$mysqli->autocommit(false); // start transaction
$mysqli->query("UPDATE users SET balance=balance-".$amount."
	WHERE id=".$_SESSION['user_id']);
$mysqli->query("UPDATE users SET balance=balance+".$amount."
	WHERE username='".$destination_username."'");
$mysqli->commit(); // commit transaction
?>

It may not be obvious, but if you don't filter your input, it is trivial for an attacker to insert a negative number into the "amount" field and transfer money from anyone's account to his account! The solution here is to simply filter amount for what you expect (in this case, a positive number that is not greater than the user's current balance).
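A minimal sketch of such a filter follows. The helper name validate_transfer_amount() and the rule of at most two decimal places are assumptions made for this illustration; the sender's balance is assumed to have been looked up from the database already.

```php
<?php
// Hypothetical helper: returns the validated amount, or null if the
// request should be rejected. $balance is the sender's current balance.
function validate_transfer_amount($raw, float $balance): ?float {
    // Accept only plain decimal numbers with up to two decimal places.
    // This already rules out negative values and non-numeric input.
    if (!is_string($raw) || !preg_match('/^\d+(\.\d{1,2})?$/D', $raw)) {
        return null;
    }
    $amount = (float) $raw;
    // The amount must be positive and must not exceed the balance.
    if ($amount <= 0 || $amount > $balance) {
        return null;
    }
    return $amount;
}

var_dump(validate_transfer_amount('50.00', 100.0)); // float(50)
var_dump(validate_transfer_amount('-5000', 100.0)); // NULL
var_dump(validate_transfer_amount('500', 100.0));   // NULL
```

In the transfer script above, a null result would mean aborting before either UPDATE query runs.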

Filtering on the Client Side is Never Enough

Suppose you have an e-mail field in an HTML form, and you use a JavaScript function (or HTML5) to check that its value is formatted as an e-mail address:

<input type="text" name="email" onchange="checkEmail(this);" /> <!-- HTML versions ≤ XHTML 1.0 -->

<input type="email" name="email" /> <!-- HTML versions ≥ 5 -->

With this filter in place, any layperson using your form is now required to submit an e-mail address in that field. However, it is trivial for an attacker to bypass this client-side filtering (e.g., by using web developer tools like Firebug) and still submit non-email text to your server. This is why IT IS ESSENTIAL THAT YOU FILTER INPUT ON THE SERVER SIDE! Any sort of filtering on the client side is just bells and whistles for the end user.
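On the server side, one way to perform the same check is PHP's built-in filter_var() function with the FILTER_VALIDATE_EMAIL filter. The wrapper clean_email() below is a hypothetical helper written for this sketch.

```php
<?php
// Hypothetical helper: returns the e-mail address if it is well-formed,
// or null if the input should be rejected.
function clean_email($raw): ?string {
    if (!is_string($raw)) {
        return null;
    }
    // filter_var() returns the address on success and false on failure.
    $email = filter_var(trim($raw), FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

var_dump(clean_email('mother.goose@example.com'));  // the address itself
var_dump(clean_email('<script>alert(1)</script>')); // NULL
```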

Information Leakage

Information Leakage is a noteworthy type of Abuse of Functionality attack that involves unprivileged users accessing privileged information. In fact, Information Leakage accounts for a significant percentage of all recorded web application vulnerabilities (second only to Cross-Site Scripting).

The concept is relatively simple. Suppose you have an administration page that loads an image containing a graph of all activity on your site:

admin.php

<?php
if($_SESSION['admin']) echo '<img src="stats.php?day=2012-08-19" alt="Stats for 2012-08-19" />';
?>

stats.php

<?php
// query the database, and save the results in $result

$im = new PNGraph();
$im->takeData($result);

header("Content-Type: image/png");
print $im->toString();
?>

Notice how you check for admin credentials in admin.php. However, you forgot to do this in stats.php itself. An attacker could simply load stats.php directly to see all of the sensitive admin-only information!

Solution

Information Leakage is an attack that requires the developer to see the big picture and really keep track of what's going on in his or her application. As a rule of thumb, whenever you query the database to access sensitive information, check the permissions of the user first.
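One way to apply this rule of thumb is to put the permission check in a small function and call it at the top of every script that serves privileged data. The names is_admin() and require_admin() are hypothetical, and the sketch assumes, as admin.php above does, that $_SESSION['admin'] is set for administrators.

```php
<?php
// Hypothetical helper: true only when the session marks the user as an admin.
function is_admin(array $session): bool {
    return !empty($session['admin']);
}

// Hypothetical guard: stop the request unless the user is an admin.
function require_admin(array $session): void {
    if (!is_admin($session)) {
        http_response_code(403);
        exit('Forbidden');
    }
}
```

With these in place, stats.php would call require_admin($_SESSION) after starting the session and before querying the database.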

A Note about Frameworks

Web frameworks are designed to make web development more agile, but they in turn have security weaknesses of their own. For instance, the infamous mass-assignment vulnerability in Ruby on Rails-type MVC frameworks is an abuse of functionality vulnerability that enables attackers to save arbitrary information in your database (!).

The important thing to know is that if you use a web framework, be familiar with the security considerations with that framework. Most of the time, frameworks will have articles on their web sites that discuss these concerns. (If your framework doesn't have a guide like this, you should probably be using a different framework!)

Real-Life Examples

CAPTCHA Abuse

CAPTCHAs are a great way to stop robots from submitting forms in your web application. However, depending on how you implement CAPTCHAs, it might be possible for a hacker to bypass the CAPTCHA on your site.

For example, suppose you are making a home-grown CAPTCHA system for your application. You have a table CaptchaTable with the following fields: id and value. Your form looks like this:

<form action="signup.php" method="post">
	<strong>Subscribe to our newsletter</strong>
	<input type="email" name="email" placeholder="Enter Your Email Here" />
	<input type="text" name="captcha_value" placeholder="Type what you see in the CAPTCHA below." />
	<img src="captcha.php?id=12345" />
	<input type="hidden" name="captcha_id" value="12345" />
	<input type="submit" value="Sign Up" />
</form>

Here's what the code on signup.php might look like:

<?php
$email = $_POST["email"];
$captcha_id = (int) $_POST["captcha_id"];
$captcha_value = $_POST["captcha_value"];

// check CAPTCHA

$captcha_stmt = $mysqli->prepare("select count(*) from CaptchaTable where id=? and value=?");
$captcha_stmt->bind_param("is", $captcha_id, $captcha_value);
$captcha_stmt->execute();
// Bind the single count(*) column to a variable and fetch it
$captcha_stmt->bind_result($captcha_matches);
$captcha_stmt->fetch();
$captcha_stmt->close();

if ($captcha_matches == 0){
	http_response_code(403);
	echo "Invalid CAPTCHA";
	exit;
}

// continue with signup procedure down here
?>

The issue here is that a hacker can simply solve one CAPTCHA and then send that CAPTCHA ID and Value over and over again!

Solution

If you already have a home-grown CAPTCHA system, the best solution here would be to simply issue another query to the database that deletes a CAPTCHA id-value pair as soon as it is submitted in a form.
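A sketch of that idea is shown below. The function consume_captcha() is hypothetical; it assumes $mysqli is an open mysqli connection and that CaptchaTable has the id and value columns described above. Rather than a separate SELECT, it lets the DELETE itself verify the answer, so a CAPTCHA cannot be checked and reused between two queries.

```php
<?php
// Hypothetical helper: verifies a CAPTCHA answer and retires the CAPTCHA
// so that the same id-value pair cannot be submitted again.
function consume_captcha(mysqli $mysqli, int $captcha_id, string $captcha_value): bool {
    // Deleting the matching row and checking how many rows were removed
    // verifies the answer and retires the CAPTCHA in a single statement.
    $stmt = $mysqli->prepare("DELETE FROM CaptchaTable WHERE id=? AND value=?");
    $stmt->bind_param("is", $captcha_id, $captcha_value);
    $stmt->execute();
    $solved = ($stmt->affected_rows === 1);
    $stmt->close();

    // Retire the CAPTCHA after a wrong answer too, so each id gets one attempt.
    $stmt = $mysqli->prepare("DELETE FROM CaptchaTable WHERE id=?");
    $stmt->bind_param("i", $captcha_id);
    $stmt->execute();
    $stmt->close();

    return $solved;
}
```

In signup.php, the check would then become a single call such as consume_captcha($mysqli, $captcha_id, $captcha_value), with the 403 response sent when it returns false.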

If you haven't yet implemented your CAPTCHA system, you might want to consider a third-party solution like reCAPTCHA.