MOPS Submission 07: Our Dynamic PHP – Obvious and not so obvious PHP code injection and evaluation

May 20th, 2010

Today we present the seventh external MOPS submission. It is an article about usual and unusual PHP code execution vulnerabilities, sent in by Arthur Gerkis.

Our Dynamic PHP

Obvious and not so obvious PHP code injection and evaluation

Arthur Gerkis, 2010-04-17


1. Abstract

We all know that PHP has a low entry barrier: it attracts programmers with limited coding skills and, as a rule, with poor knowledge of basic security concepts. This often leads to poorly written web applications that compromise the servers hosting them. While such applications are widespread, today there is a core of well-known web applications that are possibly more secure. “Possibly”, because of their history of bugs and security holes. Nevertheless, the security of the typical PHP application keeps improving, and for this we can thank security researchers’ investigations, robust web application frameworks, the maturity of the PHP interpreter and solutions like the “Suhosin” patch.

Vulnerabilities such as SQL injection, cross-site scripting, cross-site request forgery, local/remote file inclusion, directory traversal and others are well known; some of them are becoming extinct and some are given new life and power by newly discovered security flaws. But as we can see from security bug trackers, there still remain things that developers have forgotten about or never knew. In this article I would like to draw developers’ attention to PHP code execution (evaluation) in places where it is less likely to be expected, and to pieces of code that look quite innocent while giving an attacker the possibility to evaluate their own code. For completeness I will also mention old, well-known security holes.

2. Ways to Evaluate

There are many different ways to cause code evaluation. Several PHP built-in functions allow code to be changed dynamically by evaluating an expression, and some tricky language constructs and specific PHP features can cause evaluation as well. So eval() is not the only way, and later in this article we will go through the possibilities.

If a vulnerability exists and the attacker knows the source code, then only the result needs to be checked, and what counts as a result depends on the attacker’s aim. But sometimes the attacker cannot see the result. This is so-called “blind” code evaluation, which happens when the code is evaluated without visible output. This does not decrease the security risk; in some cases it turns out to be even more harmful. The attacker can use so-called “fuzzing”, a brute-force method, against the target, and no one knows what will happen if one of the brute-force attempts succeeds. In any case, any unintended code evaluation is unacceptable.

I will start with the most frequent and obvious PHP code inclusion cases and go deeper, down to things that one might not guess.

2.1. Well-known cases

2.1.1. Evaluating eval()

The first case is the PHP construct that was directly meant for evaluation: eval(). Usually web developers want to evaluate some code with dynamic changes. This approach is often used in template or plugin systems. While it looks obviously insecure, there are still a lot of applications prone to this kind of vulnerability, because it is really tricky to check and filter all the possibilities. As for practical exposure, look at the following lines:

<?php
eval("echo $foobar;");
?>

In this case the variable is interpolated into the string before eval() runs, so the contents of $foobar will be evaluated as PHP code.

<?php
eval('echo $foobar;');
?>

In the second one the value of $foobar will simply be echoed. So, if your intention was the second case, there is no reason to use double quotes. And do not forget that a variable created during an eval() call remains visible to all the native code. In some circumstances this can pose a security risk.

As was said, it is hard to do perfect escaping for eval(). There are no functions that properly escape input specifically for eval(); there are simply too many possibilities for what can and should happen along the way. It is a “feature” of dynamic code. If there is no way to avoid eval(), try to use string literals and avoid interpolating variables. When variable interpolation is needed, the variable should be initialized. A custom error handler is also a good way to catch the moment when something has gone wrong.
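The difference between the two quoting styles can be observed with a harmless payload. The following is only a sketch: the value of $foobar is a hypothetical stand-in for user input, and strtoupper() takes the place of whatever an attacker would actually inject.

```php
<?php
// Hypothetical input; in a real attack this would come from $_GET or $_POST.
$foobar = 'strtoupper("injected")';

// Double quotes: PHP interpolates $foobar BEFORE eval() sees the string,
// so the input becomes part of the source code that is compiled.
$interpolated = "return $foobar;";

// Single quotes: the string still contains the variable name, so eval()
// only reads the value of the variable as data.
$literal = 'return $foobar;';

$result_interpolated = eval($interpolated); // the injected call is executed
$result_literal      = eval($literal);      // the input is returned untouched
```

In the first call the input has been run as code; in the second it is still just a string.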

2.1.2. Code Inclusion

File inclusion vulnerabilities have also been among the most popular and dangerous. A file can be included into the initial PHP script by the following statements:

  • include(), include_once()
  • require(), require_once()

A file that has been included will be interpreted as PHP source code.

There are two categories of this attack: local file inclusion, called LFI, and remote file inclusion, as you can guess, RFI (for further explanation see reference Nr.2). Today web developers have become more careful, but LFI/RFI still happens sometimes. The best way to be protected from this kind of attack is to avoid dynamic paths. If this is not possible, their usage should be limited and checked against a list of files that are allowed to be included. Also try to use a full path rather than a partial one: if the PHP include_path directive can be modified, you can never know where a script with a partial path comes from. A good approach is to include files as follows:

<?php
define('APP_PATH', '/var/www/htdocs/');
require_once(APP_PATH . 'lib.php');
?>
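When the included file has to depend on the request, the list of allowed files mentioned above could look roughly like this. It is a sketch only: PAGE_DIR, resolve_page() and the page names are hypothetical and not taken from any real application.

```php
<?php
// Hypothetical base directory for the example.
define('PAGE_DIR', '/var/www/htdocs/pages/');

// Translate a request value into a full path using a fixed whitelist.
function resolve_page($requested, array $allowed, $default)
{
    // Only exact keys of the whitelist are accepted; anything else,
    // including "../" sequences, URLs and null bytes, gets the default page.
    if (is_string($requested) && isset($allowed[$requested])) {
        return PAGE_DIR . $allowed[$requested];
    }
    return PAGE_DIR . $allowed[$default];
}

// Hypothetical whitelist: request value => fixed file name.
$allowed_pages = array(
    'page1' => 'page1.html',
    'page2' => 'page2.html',
);

$requested = isset($_GET['file']) ? $_GET['file'] : 'page1';
$path = resolve_page($requested, $allowed_pages, 'page1');
// require_once($path); // commented out because the files are placeholders
```

The request value is used only as an array key, so it never becomes part of the path itself.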

However, the existence of a local or remote file is not always required. If remote inclusion is possible, then the attacker can embed code in the URL. We are talking about the “data URI”, which is defined in RFC 2397. Consider a vulnerable site that uses file inclusion like this:

http://www.example.com/index.php?file=page1

We can guess that the following code is used to include files:

<?php
$to_include = $_GET['file'];
require_once($to_include . '.html');
?>

And now imagine that the attacker changes the value of the “file” variable to the following:

http://www.example.com/index.php?file=data:text/plain,<?php phpinfo();?>%00

This will simply lead to PHP code execution. Thus, today LFI can be converted into remote code execution (RCE) in one more way. The data wrapper appeared in PHP 5.2.0 and will not work in older versions. PHP will also complain and refuse to use it if allow_url_include=off. Further information about the “data URI” can be found in references Nr.3 and Nr.4.

Of course, there remain other ways for PHP code to be injected and later evaluated: via Apache log files, using “/proc” and others. For examples see references Nr.5 and Nr.6, which give a good explanation of different techniques for exploiting this sort of vulnerability. Without doubt, inappropriate usage of functions like file_get_contents() and readfile(), of input wrappers like php://input and of others is a threat. We will not discuss them because they are less prevalent. Besides, they are subject to the same filtering rules as all other input.

2.2. Regular Expression

Another popular case is code evaluation in regular expressions (“regexp” from here on). Regexps are widely used because it is often easier to write a regexp than to work with several string parsing functions. It saves space and time.
Since PHP supports PCRE (Perl Compatible Regular Expressions), the “e” (PREG_REPLACE_EVAL) modifier is available for one function, preg_replace(). When a match is found, the replacement string is evaluated as PHP code after the backreferences have been substituted. Look at the following code:

<?php
$var = '<tag>phpinfo()</tag>';
preg_replace("/<tag>(.*?)<\/tag>/e", 'addslashes(\\1)', $var);
?>

Most likely, the developer’s intention was to sanitize input with addslashes(). But the attacker’s thoughts do not coincide. Here phpinfo() would be executed.

However, even if there is no “e” modifier, attackers sometimes still have a way to evaluate code. It can be achieved by cutting off part of the regexp with a null byte. Let’s look at the same example, slightly modified:

<?php
$regexp = $_GET['re'];
$var = '<tag>phpinfo()</tag>';
preg_replace("/<tag>(.*?)$regexp<\/tag>/", '\\1', $var);
?>

Maybe this example looks too naive, but the aim here is to show when a null-byte attack could work. Now consider that the vulnerable script accepts a request like this:

http://www.example.com/index.php?re=<\/tag>/e%00

This would modify the original regular expression and cause the code to be evaluated. With magic_quotes_gpc=on this would no longer work. However, it was decided to remove this directive in PHP 6.0 because it has caused web developers a lot of problems. Proper input filtering is not the job of the PHP interpreter; only the web developer is responsible for it.

Double quotes should not be used without need; single quotes are safer. If possible, it is better to use the similar function preg_replace_callback(). The only difference is that a callback function is called instead of a replacement string being applied. And again, beware of the callback if it has an effect outside of PHP. One filtering method is the preg_quote() function. It escapes regexp special characters, thus saving us from confusion.
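Both recommendations can be sketched as follows. The function names escape_tag_contents() and strip_tag_with_suffix() are hypothetical, and the anonymous function syntax assumes PHP 5.3 or later.

```php
<?php
// The replacement is produced by a fixed callback, so the matched text is
// only ever passed to addslashes() as data and never compiled as PHP code.
function escape_tag_contents($input)
{
    return preg_replace_callback(
        '/<tag>(.*?)<\/tag>/',
        function ($matches) {
            return addslashes($matches[1]);
        },
        $input
    );
}

// When user input has to become part of a pattern, preg_quote() escapes
// regexp special characters and the delimiter given as second argument.
function strip_tag_with_suffix($input, $user_suffix)
{
    $pattern = '/<tag>(.*?)' . preg_quote($user_suffix, '/') . '<\/tag>/';
    return preg_replace($pattern, '$1', $input);
}
```

With the quoted suffix, a request value such as `<\/tag>/e` is matched literally and can no longer terminate the pattern or add modifiers.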

2.3. Dynamic Code

By dynamic code we mean here everything that can change the normal code execution flow: dynamic variables (variable variables), creation of new functions on the fly and complex curly syntax.

2.3.1. Dynamic Variables

PHP allows programmers to use “variable variables” in their code. In this case the name of a variable is set dynamically. Sometimes, to preserve compatibility with old code that cannot be modified, programmers (bad programmers) use the following imitation of register_globals=on:

<?php
foreach ($_GET as $key => $value) {
   $$key = $value;
}
// ... some code
if (logged_in() || $authenticated) {
   // ... administration area
}
?>

This can be convenient not only for beginners or lazy coders, but also for an attacker. Imagine that the attacker sends the following request to the application:

http://www.example.com/index.php?authenticated=true

In combination with an insufficient authentication check as shown in the example (which is the rule in applications with vulnerabilities of this level), it would give access to the restricted area.
While this approach has become part of history, such code can still be seen in applications. Values should be initialized and the scope of variables should be reviewed. register_globals has been switched off by default since PHP 4.2.0 and will be removed in PHP 6.0.
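A safer replacement for the $$key loop is to copy only expected parameters into an array. The sketch below assumes a hypothetical helper import_request() and hypothetical parameter names.

```php
<?php
// Copy only explicitly named parameters into a fresh array instead of
// creating variables from attacker-chosen names.
function import_request(array $source, array $defaults)
{
    $clean = $defaults; // every expected key starts with a known value
    foreach ($defaults as $key => $default) {
        if (isset($source[$key]) && is_string($source[$key])) {
            $clean[$key] = $source[$key];
        }
    }
    return $clean;
}

// Hypothetical parameters; "authenticated" is deliberately not listed.
$params = import_request($_GET, array('page' => 'home', 'lang' => 'en'));

// Security-relevant state is initialized in code and never taken from input.
$authenticated = false;
```

A request with ?authenticated=true then has no effect, because that key is never copied.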

2.3.2. Dynamic Functions

This may be the most popular kind of vulnerability in this category. Dynamic functions can be created in at least two ways. The first is as follows:

<?php
$dyn_func = $_GET['dyn_func'];
$argument = $_GET['argument'];
$dyn_func($argument);
?>

Or, if register_globals=on, the previous code is equivalent to the following:

<?php
$dyn_func($argument);
?>

And if we call the script this way:

http://www.example.com/index.php?dyn_func=system&argument=uname

Here the variable $dyn_func becomes the name of the function and $argument is its argument. It should be clear how this could end.
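If the called function really has to be selected by the request, a fixed lookup table keeps the choice under the developer’s control. This is a sketch under assumptions: dispatch_action() and the two action functions are hypothetical examples.

```php
<?php
// Hypothetical actions an application might expose.
function action_upper($argument)
{
    return strtoupper($argument);
}

function action_length($argument)
{
    return strlen($argument);
}

function dispatch_action($name, $argument)
{
    // The request value selects an entry in this fixed table; it is never
    // used as a function name itself.
    $actions = array(
        'upper'  => 'action_upper',
        'length' => 'action_length',
    );
    if (!is_string($name) || !isset($actions[$name])) {
        return null; // unknown action: nothing is called
    }
    return call_user_func($actions[$name], $argument);
}
```

A request with dyn_func=system now finds no entry in the table, so nothing is called.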

In the second case it is possible to exploit the code even without a function name. create_function() creates an anonymous function and executes its second argument as the body of the new function. An example of vulnerable code:

<?php
$foobar = $_GET['foobar'];
$dyn_func = create_function('$foobar', "echo $foobar;");
$dyn_func('');
?>

Then the following request would output the result of executing the system command “ls”:

http://www.example.com/index.php?foobar=system('ls')

Compared to eval(), it would look something like this:

<?php
eval("function lambda_n() { echo system('ls'); }");
lambda_n();
?>

The reason for this is that create_function() is simply an internal PHP wrapper that uses eval.
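Where an anonymous function is needed, a closure avoids building the function body from a string. This sketch assumes PHP 5.3 or later; the name $print_value is hypothetical.

```php
<?php
// The function body is fixed source code; user input arrives only as an
// argument and is therefore never parsed as PHP.
$print_value = function ($value) {
    return 'Value: ' . $value;
};

// Falls back to the payload from the example above when no request value
// is present.
$foobar = isset($_GET['foobar']) ? $_GET['foobar'] : "system('ls')";
$output = $print_value($foobar);
```

The payload ends up in $output as plain text and no system command is run.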

2.3.3. Curly syntax

Complex curly syntax is meant to embed code in strings while keeping it separate from the text. As the PHP manual says, “complex” means that it allows complex expressions to be used. Usually developers use this functionality like this:

<?php
$year = "10";
$foobar = "That was 20{$year}-th year.";
?>

This way it is possible to merge text in strings with variable values. And now imagine this case:

<?php
$var = "I was innocent until ${`ls`} appeared here";
?>

Here the system command “ls” will be executed. But why does this happen? The code between the curly braces is evaluated and its result replaces {`ls`}, thus forming a variable name. That is why, if we run this code, PHP complains about an undefined variable whose name consists of the directory listing. It is the same case as:

<?php
eval("$foo");
?>

Here “foo” is result of command “ls” (as a string). And something crazy like this will also work:

<?php
$foobar = 'phpinfo';
${'foobar'}();
?>

While this does not pose any security risk when used alone, curly syntax can help to evade filters in preg_replace() and other places where code is mixed with strings. Most exploits that use the regexp vulnerability rely on this trick.

2.4. Rare but possible cases

This may not be such a widespread pitfall for web developers, because the following functions are used differently. They often do not need to interact with user input as actively as in the previous examples. But because there are so many of these functions, they are worth mentioning.

The ob_start() function can take a callback function as an argument, which will be executed when the output buffer is flushed. The contents that were printed are passed to the callback function as an argument. Usually output buffering is used to compress the data that is sent back to the browser. Let’s consider the following malicious usage:

<?php
$foobar = 'system';
ob_start($foobar);
echo 'uname';
ob_end_flush();
?>

If $foobar is controllable, then this would execute the system command “uname”.

The assert() function is as easy to exploit as eval(), but it is not such a common security hole. It accepts a string as an argument, which will be evaluated. The following code has the same effect as the previous one:

<?php
$foobar = 'system("uname")';
assert($foobar);
?>

This function should be used in development code to help with debugging. And still developers often misuse assert().

Developers can use array functions to apply a list of operations to data or vice versa. In most cases the applied functions are predefined, so this vulnerability is also not so common in the wild. Anyway, it is still worth mentioning because of its potential:

  • array_map()
  • usort(), uasort(), uksort()
  • array_filter()
  • array_reduce()
  • array_diff_uassoc(), array_diff_ukey()
  • array_udiff(), array_udiff_assoc(), array_udiff_uassoc()
  • array_intersect_uassoc(), array_intersect_ukey()
  • array_uintersect(), array_uintersect_assoc(), array_uintersect_uassoc()
  • array_walk(), array_walk_recursive()

As an example let’s take one function from this list and exploit it:

<?php
$evil_callback = $_GET['callback'];
$some_array = array(0, 1, 2, 3);
$new_array = array_map($evil_callback, $some_array);
?>

Now the attacker enters this in the browser:

http://www.example.com/index.php?callback=phpinfo

As a result, the callback defined by the attacker is applied to every element of the array and phpinfo() gets executed.

Continuing with functions that use callbacks, we can mention the XML Parser functions. These functions are enabled in PHP by default:

  • xml_set_character_data_handler()
  • xml_set_default_handler()
  • xml_set_element_handler()
  • xml_set_end_namespace_decl_handler()
  • xml_set_external_entity_ref_handler()
  • xml_set_notation_decl_handler()
  • xml_set_processing_instruction_handler()
  • xml_set_start_namespace_decl_handler()
  • xml_set_unparsed_entity_decl_handler()

Some other frequently used functions that were not mentioned but also use callbacks:

  • stream_filter_register()
  • set_error_handler()
  • register_shutdown_function()
  • register_tick_function()

The functions mentioned here are those that might appear in applications, but there still remains a huge number of undocumented and deprecated functions and extensions that use callback functions or otherwise allow dynamic code to be generated. It is nearly impossible to remember them all. Moreover, they can appear and disappear in different PHP releases. So the easier and safer way is to sanitize input properly and to think about what can influence the value passed as a callback argument.

2.5. Miscellaneous and not the last

We have discussed a lot of ways in which evil code can be injected into native code. But the game is not over. We should also not forget about complicated cases related to bugs in the PHP interpreter itself and to particular combinations of circumstances. Until such a bug is discovered and publicly disclosed, everyone is exposed.

One striking example is the relatively recently discovered insecure behavior of the unserialize() function in combination with class destructors. In short, if we unserialize an object of some class, then __destruct() of this class will be called when the object is destroyed. If it is possible to send a specially crafted serialized string, then the attacker is able to execute arbitrary code. An oversimplified example:

<?php
class Example {
   var $var = '';
   function __destruct() {
      eval($this->var);
   }
}
unserialize($_GET['saved_code']);
?>

And the following link would execute desired code, in our case, phpinfo():

http://www.example.com/index.php?saved_code=O:7:"Example":1:{s:3:"var";s:10:"phpinfo();";}

Of course, certain circumstances are needed for this string to reach object methods, and it is not so easy to find an exploitable place, but there are exploits in the wild based on this behavior. So it should be taken into account. For a better and original explanation of this vulnerability see reference Nr.7.
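Two possible mitigations can be sketched as follows. Note the assumptions: the allowed_classes option of unserialize() exists only in PHP 7.0 and later, which is newer than the versions this article discusses, and the DestructDemo class is a hypothetical stand-in for the vulnerable class above.

```php
<?php
// Class modeled on the vulnerable example; the destructor only counts
// calls instead of evaluating anything.
class DestructDemo
{
    public static $destructed = 0;
    public $var = '';

    public function __destruct()
    {
        self::$destructed++;
    }
}

// Hypothetical attacker-controlled string, as it would arrive in $_GET.
$payload = 'O:12:"DestructDemo":1:{s:3:"var";s:10:"phpinfo();";}';

// With allowed_classes => false no user-defined class is instantiated;
// the result is a __PHP_Incomplete_Class placeholder without a destructor.
$restored = unserialize($payload, array('allowed_classes' => false));

// For plain data, JSON avoids object instantiation altogether.
$settings = json_decode('{"theme":"dark"}', true);
```

In both variants no destructor of an application class runs on data chosen by the attacker.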

3. To sum up

As you see, this problem is still quite relevant. If you decide to allow users to embed some code, then think of such embedded code as if it were always executing dangerous functions: simply never trust the user, or even the administrator. It is very hard to escape user input correctly because there always remains a possibility that something has been forgotten. That is why using blacklists is generally a bad idea. It is much easier and safer to define a whitelist of allowed functions, tags, or whatever you wish to allow as input. However, this is not a guarantee of security either. The more static the code, the safer it is.

If you aim to write portable and secure code, then never rely on the PHP configuration: it can change every time your application leaves the development platform. For example, if your PHP is configured with safe_mode=on, magic_quotes_gpc=on, register_globals=off and so on, it does not mean that the application will always remain bulletproof and stable. Whereas in PHP 5.0 the Zend Engine developers concentrated on OOP, in PHP 6.0 the focus is on security, better Unicode support and less dependence on configuration. This article has mentioned some features that will be removed in the new PHP, so be ready for broken applications if you do not take those changes into account.

Maybe you have noticed that everything that is convenient for the developer turns out to be a security risk. That is a kind of payment for the properties that are supposed to make life easier. The more static code you use, the fewer possibilities there are to get exploited. We gain one thing, faster development, but lose another, control over the application. Beware of it.

4. References

#1 http://www.hardened-php.net/suhosin/ Suhosin, advanced protection system for PHP
#2 http://projects.webappsec.org/Remote-File-Inclusion explanation of RFI
#3 http://tools.ietf.org/html/rfc2397 The “data” URL scheme
#4 http://www.php.net/manual/en/wrappers.data.php Data (RFC 2397), PHP manual
#5 http://www.ush.it/2008/08/18/lfi2rce-local-file-inclusion-to-remote-code-execution-advanced-exploitation-proc-shortcuts/ how LFI can lead to RCE
#6 http://www.exploit-db.com/papers/260 how LFI can lead to RCE (2)
#7 https://www.sektioneins.de/en//en/advisories/index.html unserialize() based advisories and many others


