The following function breaks with the regexp I've provided in the $pattern variable. If I change the regexp I'm fine, so I think that's the problem. I'm not seeing the problem, though, and I'm not receiving a standard PHP error even though they're turned on.
function parseAPIResults($results){
    //Takes results from getAPIResults, returns array.
    $pattern = '/\[(.|\n)+\]/';
    $resultsArray = preg_match($pattern, $results, $matches);
}
Firefox 6: The connection was reset
Chrome 14: Error 101 (net::ERR_CONNECTION_RESET): The connection was reset.
IE 8: Internet Explorer cannot display the webpage
UPDATE:
Apache/PHP may be crashing. Here's the Apache error log from when I run the script:
[Sat Oct 01 11:41:40 2011] [notice] Parent: child process exited with status 255 -- Restarting.
[Sat Oct 01 11:41:40 2011] [notice] Apache/2.2.11 (Win32) PHP/5.3.0 configured -- resuming normal operations
Running WAMP 2.0 on Windows 7.
Best Answer
Simple question. Complex answer!
Yes, this class of regex will repeatably (and silently) crash Apache/PHP with an unhandled segmentation fault due to a stack overflow!
Background:
The PHP preg_* family of regex functions uses the powerful PCRE library by Philip Hazel. With this library, there is a certain class of regex that requires many recursive calls to its internal match() function, and this uses up a lot of stack space (the stack space used is directly proportional to the size of the subject string being matched). Thus, if the subject string is too long, a stack overflow and corresponding segmentation fault will occur. This behavior is described in the PCRE documentation, at the end, under the section titled: pcrestack.
PHP Bug 1: PHP sets pcre.recursion_limit too large.
The PCRE documentation describes how to avoid a stack overflow segmentation fault by limiting the recursion depth to a safe value roughly equal to the stack size of the linked application divided by 500. When the recursion depth is properly limited as recommended, the library does not generate a stack overflow and instead gracefully exits with an error code. Under PHP, this maximum recursion depth is specified with the pcre.recursion_limit configuration variable, and (unfortunately) the default value is set to 100,000. This value is TOO BIG! Here is a table of safe values of pcre.recursion_limit for a variety of executable stack sizes (stack size in bytes divided by 500, rounded down):
Stack size    pcre.recursion_limit
 64 MB        134217
 32 MB        67108
 16 MB        33554
  8 MB        16777
  4 MB        8388
  2 MB        4194
  1 MB        2097
512 KB        1048
256 KB        524
Thus, for the Win32 build of the Apache webserver (httpd.exe), which has a (relatively small) stack size of 256KB, the correct value of pcre.recursion_limit is 524. This can be accomplished with the following line of PHP code:
ini_set("pcre.recursion_limit", "524");
When this code is added to the PHP script, the stack overflow does NOT occur; the library instead generates a meaningful error code. That is, it SHOULD generate an error code! (But unfortunately, due to another PHP bug, preg_match() does not.)
PHP Bug 2: preg_match() does not return FALSE on error.
The PHP documentation for preg_match() says that it returns FALSE on error. Unfortunately, PHP versions 5.3.3 and below have a bug (#52732) where preg_match() does NOT return FALSE on error (it instead returns int(0), which is the same value returned in the case of a non-match). This bug was fixed in PHP version 5.3.4.
Solution:
Assuming you will continue using WAMP 2.0 (with PHP 5.3.0) the solution needs to take both of the above bugs into consideration. Here is what I would recommend:
- Set pcre.recursion_limit to a safe value: 524.
- Explicitly check for an error whenever preg_match() returns anything other than int(1).
- If preg_match() returns int(1), then the match was successful.
- If preg_match() returns int(0), then the match was either not successful, or there was an error.
Here is a modified version of your script (designed to be run from the command line) that determines the subject string length that results in the recursion limit error:
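A minimal sketch of such a test, not the original script. It assumes the 524 limit discussed above, uses preg_last_error() to tell an error apart from a plain non-match, and grows the subject string in fixed steps. The names matchBracketed() and findFailingLength(), the step of 100, and the 50000-character cap are hypothetical choices for illustration. Results on later PHP/PCRE versions may differ from those on PHP 5.3.0.

```php
<?php
// Assumed safe limit for a 256KB stack (Win32 httpd.exe); see the text above.
ini_set('pcre.recursion_limit', '524');

// Hypothetical helper: returns true on a match, false on a non-match,
// and null when the PCRE library reported an error.
function matchBracketed($subject)
{
    $pattern = '/\[(.|\n)+\]/';
    $result = preg_match($pattern, $subject, $matches);
    if ($result === 1) {
        return true;
    }
    // On PHP <= 5.3.3 an error also yields int(0), so ask PCRE directly.
    if ($result === false || preg_last_error() !== PREG_NO_ERROR) {
        return null;
    }
    return false;
}

// Hypothetical driver: grows the subject until an error is reported,
// printing each length tried, and gives up at $maxLength.
function findFailingLength($maxLength, $step)
{
    for ($len = $step; $len <= $maxLength; $len += $step) {
        echo "Testing subject length $len\n";
        $subject = '[' . str_repeat('a', $len) . ']';
        if (matchBracketed($subject) === null) {
            return $len;
        }
    }
    return null; // no error seen up to $maxLength
}

$failedAt = findFailingLength(50000, 100);
echo $failedAt === null
    ? "No PCRE error up to 50000 characters\n"
    : "PCRE error at subject length $failedAt\n";
```

Commenting out the ini_set() line leaves the default limit in place, which is the configuration the answer describes as crashing on a small stack.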
When you run this script, it provides a continuous readout of the current length of the subject string. If pcre.recursion_limit is left at its too-high default value, this allows you to measure the length of string that causes the executable to crash.
Comments:
- preg_match() fails to return FALSE when an error occurs in the PCRE library. This bug certainly calls into question a LOT of code that uses preg_match()! (I'm certainly going to do an inventory of my own PHP code.)
- The Apache webserver executable (httpd.exe) is built with a stack size of 256KB. The PHP command line executable (php.exe) is built with a stack size of 8MB. The safe value for pcre.recursion_limit should be set in accordance with the executable that the script is being run under (524 and 16777 respectively).
- PHP should set pcre.recursion_limit to a safe value by default.
- The preg_match() bugfix should be backported to PHP version 5.2.
- [...] httpd.exe executable. (This works under XP but Vista and Win7 might complain.)
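The per-executable limits quoted above (524 and 16777) can also be derived at run time rather than hard-coded. The sketch below assumes the stack sizes given in this answer (256KB for httpd.exe, 8MB for php.exe) and uses php_sapi_name() to guess which executable is running. The helpers safeRecursionLimit() and assumedStackBytes() are hypothetical names, and the SAPI-to-stack-size mapping is an assumption that will not hold for every build.

```php
<?php
// Rule of thumb described above: stack size in bytes divided by 500, rounded down.
function safeRecursionLimit($stackBytes)
{
    return (int) floor($stackBytes / 500);
}

// Hypothetical helper: assume 8MB for the command line executable and
// 256KB for anything else (e.g. the Apache module), per the figures above.
function assumedStackBytes($sapi)
{
    return $sapi === 'cli' ? 8 * 1024 * 1024 : 256 * 1024;
}

$limit = safeRecursionLimit(assumedStackBytes(php_sapi_name()));
ini_set('pcre.recursion_limit', (string) $limit);
echo "pcre.recursion_limit set to $limit\n";
```

If the real stack size of the executable is known, passing it to safeRecursionLimit() directly is more reliable than guessing from the SAPI name.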