Difference between revisions of "Web Application Security, Part 1"

From CSE330 Wiki

Latest revision as of 21:05, 18 July 2018

Application-level web security is of increasing concern among web developers. This article outlines some types of security threats to your web application and how to defend against them.

This is Part 1 of the Web Application Security article, geared toward the material covered in Module 2. For material covered in Module 3 (MySQL), see Web Application Security, Part 2. For material covered in Module 6 (JavaScript), see Web Application Security, Part 3.

Introduction to Application-Level Web Security

Every day, computer hackers around the world penetrate web applications, often for personal profit. You may find it hard to believe, but even high-profile web sites (banks, social media, even computer security companies) are vulnerable to application-level attacks!

Not only is it embarrassing to be the programmer who wrote the vulnerable code, but it could also cost you your job. As a prudent web developer, it is imperative that you take precautionary measures to make your application difficult to penetrate. Indeed, most of the time, if your site is well-written, hackers will just move on.

Here's the golden rule: Anything in your site that accepts user input, whether via a form, an AJAX request, a file upload, or even malformed links, can be used as an attack vector. NEVER TRUST USER INPUT!!! This can be summarized in the acronym FIEO, or Filter Input, Escape Output.

FIEO in PHP

Filtering Input

"Filter Input" means that you should check that input data is of the format that you are expecting. For example, if you are expecting a number, you should cast it to a float or an int. If you are expecting a phone number, you should run it through a regular expression (you will learn regular expressions in module 4). For example:

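<?php
// Cast a number to a float or an int (a missing field becomes 0):
$amount = (float) ($_POST['amount'] ?? 0);

// Pass a phone number through a regular expression.
// The \A and \z anchors require the whole string to match, not just part of it:
$phone = $_POST['phone'] ?? "";
$phone = preg_match('/\A\d{3}-\d{3}-\d{4}\z/', $phone) ? $phone : "";
?>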
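
The same filtering can be wrapped in small reusable functions. The following is a minimal sketch, not part of the original article: the helper names filter_amount and filter_phone are hypothetical, and the input array is passed in as a parameter so the functions can be tried without a real form submission. It uses PHP's built-in filter_var, which rejects non-numeric strings instead of silently casting them to 0.

```php
<?php
// Hypothetical helper: returns the amount as a float, or null if it is missing or not numeric.
function filter_amount(array $input): ?float
{
    $value = filter_var($input['amount'] ?? null, FILTER_VALIDATE_FLOAT);
    return $value === false ? null : $value;
}

// Hypothetical helper: returns the phone number only if the ENTIRE string
// has the form NNN-NNN-NNNN; otherwise returns an empty string.
function filter_phone(array $input): string
{
    $phone = $input['phone'] ?? '';
    if (is_string($phone) && preg_match('/\A\d{3}-\d{3}-\d{4}\z/', $phone) === 1) {
        return $phone;
    }
    return '';
}
```

In a real script you would call these as filter_amount($_POST) and filter_phone($_POST), then refuse to continue if either one reports invalid input.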

Escaping Output

"Escape Output" means that you need to nullify, or escape, characters that have special meaning in the markup language of interest. For example, consider the following string:

If a<b and b<c then a<c.

Since a less-than sign means the start of a tag in HTML, and b is a valid tag name, the above string will not render as you might expect in HTML. Therefore, we need to escape our less-than signs by using HTML entities:

If a&lt;b and b&lt;c then a&lt;c.

The "&lt;" is an HTML entity that will render as a less-than sign. (For more information on HTML entities, read this article on the WebPlatform wiki: https://webplatform.github.io/docs/html/entities)

PHP provides a function, htmlentities, that, given a string, will convert special characters to their HTML entity equivalents.

<?php
$str = "If a<b and b<c then a<c.";

// Convert special characters to HTML entities before outputting:
echo htmlentities($str);
?>

Note: htmlentities escapes a string for use in HTML, but it does not escape a string for use in other markup languages. You need to use different methods when escaping strings for other languages.
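
To illustrate the note above, here is a short sketch of how the escaping function changes with the output context. The example strings are made up for illustration; htmlspecialchars and rawurlencode are standard PHP functions. htmlspecialchars is a close relative of htmlentities that converts only the characters that are special in HTML, and the ENT_QUOTES flag makes it escape both kinds of quotes so the result is also safe inside an attribute value.

```php
<?php
$str = 'If a<b and "b"<c then a<c.';

// Context 1: HTML text or attribute value. Escape <, >, & and both kinds of quotes.
$forHtml = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

// Context 2: a value inside a URL query string. HTML entities are wrong here;
// percent-encoding is the escaping that URLs understand.
$forUrl = rawurlencode('a<b & c');

echo $forHtml, "\n"; // If a&lt;b and &quot;b&quot;&lt;c then a&lt;c.
echo $forUrl, "\n";  // a%3Cb%20%26%20c
```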

Why Not to Escape Input

Filtering your input is important, as shown above. However, it is bad practice to escape your input. For example, don't do this:

<?php
$message = htmlentities($_POST['message']); // bad practice
// then store $message in a database, etc.
?>

The reason this is bad practice is that it permanently ties that string to its final output format. For example, what if some time down the road you want to support display of that message in a PDF? You'd need to go back and remove all the HTML entities again.

This is why you should filter strings at the input stage but not escape them until the final output stage.
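
A second problem with escaping at the input stage is double escaping. The sketch below uses a made-up message to show what happens when a value that was escaped before storage passes through an output template that (correctly) escapes it again.

```php
<?php
$raw = 'Tom & Jerry <3';

// Bad practice: escape at input time, so the stored value is already HTML.
$stored = htmlentities($raw);

// Later, the output code escapes again, as it should for any stored string.
// The entities themselves get escaped, and the visitor sees "&amp;" on the page.
$doubleEscaped = htmlentities($stored);

// Good practice: store $raw as-is and escape exactly once, at output time.
$correct = htmlentities($raw);

echo $doubleEscaped, "\n"; // Tom &amp;amp; Jerry &amp;lt;3
echo $correct, "\n";       // Tom &amp; Jerry &lt;3
```

Because the raw string is what gets stored, the same value can later be sent to a plain-text e-mail or a PDF without first undoing any HTML escaping.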

Format String Injection

If you like using functions like printf and sprintf, you may find yourself writing

printf( "%s", htmlentities($_GET['username']) ); // good example

It is tempting to reduce this to

printf( htmlentities($_GET['username']) ); // BAD example

Although the second implementation will work for most usernames, it is not correct! You are essentially making the client-provided username the format string for printf. If the username contains a percent sign (%), printf interprets it as the start of a conversion specification. Because no matching argument was supplied, the call fails (with a warning in PHP 7 and an exception in PHP 8) instead of printing the username. Worse yet, certain combinations of format parameters are known to reveal system information in the error log.

Solution

The solution is simple: never use dynamic input as the format string. It should always be static, either hard-coded or from a stable source like a YAML file. User-supplied input should always be fed into the string as arguments to sprintf and printf.

If you are outputting only a single short string, as in the example above, it suffices to use a PHP function like print or echo:

print htmlentities($_GET['username']); // good example
echo htmlentities($_GET['username']); // good example
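
The difference between the two approaches can be seen in a small experiment. This is a sketch for illustration only: the function names greet and unsafe_format are hypothetical, and unsafe_format exists purely to demonstrate the failure, not as something to use in an application.

```php
<?php
// Safe: the format string is a constant; user input is only ever an argument.
function greet(string $username): string
{
    return sprintf('Hello, %s!', htmlentities($username));
}

// Unsafe, for demonstration only: user input becomes the format string.
// Returns the formatted text, or the string 'error' if sprintf fails.
function unsafe_format(string $username): string
{
    try {
        $result = @sprintf($username);   // older PHP: warning + false
        return $result === false ? 'error' : $result;
    } catch (\Throwable $e) {            // newer PHP: an exception is thrown
        return 'error';
    }
}

echo greet('a<b%s'), "\n";          // Hello, a&lt;b%s!
echo unsafe_format('user%s'), "\n"; // error
```

In greet, the percent sign in the username is printed literally because sprintf never re-parses its arguments as format strings.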

Server Configurations

Sometimes hackers attempt to penetrate your application from the server side rather than the application side. Server-side security is beyond the scope of this course, but here are some things you should keep in mind.

  • Use a highly secure root password, and make it one that you don't use anywhere else. Seriously.
  • Use a firewall system to block unnecessary ports from public access. SSH and the web server should really be the only ports you need. You should keep the web server on port 80, but you have the option of moving SSH to a port other than 22 to make it slightly more secure.

Git Exposed

Another thing to keep in mind is that by default, Apache serves up everything in your file tree, except for Apache-specific configuration files like .htaccess. This means that if you're not careful, your .git directory can be served, exposing your raw source code to attackers, including things like database passwords! A recent study found that one in every 600 web sites is making this mistake (http://www.jamiembrown.com/blog/one-in-every-600-websites-has-git-exposed/). Don't be one of them!
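
One quick way to audit your own site is to check whether a .git directory sits inside the folder that Apache serves. The following is a rough sketch under stated assumptions: the function name git_dir_in_docroot is hypothetical, it only looks at the top level of the document root (not subfolders), and it tells you that the directory exists, not whether your server configuration already blocks access to it.

```php
<?php
// Hypothetical audit helper: returns true if a .git directory sits directly
// inside the given document root, where the web server could serve it.
function git_dir_in_docroot(string $docRoot): bool
{
    return is_dir(rtrim($docRoot, '/') . '/.git');
}
```

You could call it as git_dir_in_docroot($_SERVER['DOCUMENT_ROOT']) from a script on your server. If it returns true, consider moving the repository so that only a subfolder of it is the document root, or blocking access to .git in your server configuration.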