zend_compile: Optimize `sprintf()` into a rope by TimWolla · Pull Request #14546 · php/php-src · GitHub
zend_compile: Optimize sprintf() into a rope #14546


Merged 7 commits on Jun 13, 2024
zend_compile: Optimize sprintf() into a rope
This optimization will compile `sprintf()` using only `%s` placeholders into a
rope at compile time, effectively making those calls equivalent to the use of
string interpolation, with the added benefit of supporting arbitrary
expressions instead of just expressions starting with a `$`.
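
The eligibility rule described above can be sketched in standalone C, outside the engine. This is only an illustration: the helper name `only_string_placeholders` is hypothetical and not part of php-src, and it assumes the format string is already known as a constant. It accepts `%s` and the escaped `%%`, and rejects everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of php-src): returns true if `fmt` contains
 * only `%s` and `%%` directives and stores the number of `%s` in *count. */
static bool only_string_placeholders(const char *fmt, size_t len, size_t *count)
{
	assert(fmt != NULL && count != NULL);

	const char *p = fmt;
	const char *end = fmt + len;

	*count = 0;
	while ((p = memchr(p, '%', (size_t)(end - p))) != NULL) {
		const char *q = p + 1;
		if (q == end) {
			return false; /* lone '%' at the very end of the format */
		}
		if (*q == 's') {
			(*count)++;
		} else if (*q != '%') {
			return false; /* any other directive, e.g. %d or %5s */
		}
		p = q + 1; /* continue after the two-byte directive */
	}
	return true;
}
```

A call with `"%s-%s"` reports two placeholders, while `"%d"` is rejected, in which case a caller would fall back to the regular function call.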

For a synthetic test using:

    <?php

    $a = 'foo';
    $b = 'bar';

    for ($i = 0; $i < 100_000_000; $i++) {
    	sprintf("%s-%s", $a, $b);
    }

This optimization yields a 2.1× performance improvement:

    $ hyperfine 'sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php' \
          '/tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php'
    Benchmark 1: sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php
      Time (mean ± σ):      1.869 s ±  0.033 s    [User: 1.865 s, System: 0.003 s]
      Range (min … max):    1.840 s …  1.945 s    10 runs

    Benchmark 2: /tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php
      Time (mean ± σ):      4.011 s ±  0.034 s    [User: 4.006 s, System: 0.005 s]
      Range (min … max):    3.964 s …  4.079 s    10 runs

    Summary
      sapi/cli/php -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php ran
        2.15 ± 0.04 times faster than /tmp/unoptimized -d zend_extension=php-src/modules/opcache.so -d opcache.enable_cli=1 test.php

This optimization comes with a small and probably insignificant behavioral
change: If one of the values cannot be (cleanly) converted to a string, for
example when attempting to insert an object that is not `Stringable`, the
resulting Exception will naturally not show the `sprintf()` call in the
resulting stack trace, because there is no call to `sprintf()`.

Nevertheless it will correctly point out the line of the `sprintf()` call as
the source of the Exception, pointing the user towards the correct location.
TimWolla committed Jun 12, 2024
commit 9fc7a7328c4236cf1e70059e108cb47ef09b0d92
167 changes: 167 additions & 0 deletions Zend/zend_compile.c
@@ -4712,6 +4712,171 @@ static void zend_compile_ns_call(znode *result, znode *name_node, zend_ast *args
}
/* }}} */

static zend_op *zend_compile_rope_add(znode *result, uint32_t num, znode *elem_node);
Contributor

The benchmark shows decreased performance:

[image: benchmark results]

Member Author

Please explain how this change would be able to affect bench.php.

Please also explain how the addition of two comments to the source code significantly changed the results compared to these: https://github.com/php/php-src/actions/runs/9484406283

Contributor

Maybe the performance degradation is coming from #14054 and the base branch is being detected incorrectly?

static zend_op *zend_compile_rope_add_ex(zend_op *opline, znode *result, uint32_t num, znode *elem_node);
static void zend_compile_rope_finalize(znode *result, uint32_t j, zend_op *init_opline, zend_op *opline);

static zend_result zend_compile_func_sprintf(znode *result, zend_ast_list *args) /* {{{ */
{
	if (args->children < 1) {
		return FAILURE;
	}

	zend_eval_const_expr(&args->child[0]);
	if (args->child[0]->kind != ZEND_AST_ZVAL) {
		return FAILURE;
	}

	zval *format_string = zend_ast_get_zval(args->child[0]);
	if (Z_TYPE_P(format_string) != IS_STRING) {
		return FAILURE;
	}
	if (Z_STRLEN_P(format_string) >= 256) {
		return FAILURE;
	}

	char *p;
	char *end;
	uint32_t string_placeholder_count;

	string_placeholder_count = 0;
	p = Z_STRVAL_P(format_string);
	end = p + Z_STRLEN_P(format_string);

	for (;;) {
		p = memchr(p, '%', end - p);
		if (!p) {
			break;
		}

		char *q = p + 1;
		if (q == end) {
			return FAILURE;
		}

		switch (*q) {
			case 's':
				string_placeholder_count++;
				break;
			case '%':
				break;
			default:
				return FAILURE;
		}

		p = q;
		p++;
	}

	/* Bail out if the number of placeholders does not match the number of values. */
	if (string_placeholder_count != (args->children - 1)) {
		return FAILURE;
	}

	znode *elements = NULL;

	if (string_placeholder_count > 0) {
		elements = safe_emalloc(sizeof(*elements), string_placeholder_count, 0);
	}

	/* Compile the value expressions first for error handling that is consistent
	 * with a function call: Values that fail to convert to a string may emit errors.
	 */
	for (uint32_t i = 0; i < string_placeholder_count; i++) {
		zend_compile_expr(elements + i, args->child[1 + i]);
		if (elements[i].op_type == IS_CONST) {
			if (Z_TYPE(elements[i].u.constant) != IS_ARRAY) {
				convert_to_string(&elements[i].u.constant);
			}
Comment on lines +4798 to +4800
Member
@Girgias Girgias Jun 13, 2024

If the value is an object, you could potentially check whether it implements the Stringable interface? That would miss oddities like GMP, which does not implement the interface although it can be cast to a string (which we really should fix...), but it should work unless the __toString() method throws an error.

EDIT: I realized the issue is that one would need to possibly trigger autoloading to check the class....

Member

These are compile-time constant expressions; they cannot currently contain objects (not even enums).

		}
	}

	uint32_t rope_elements = 0;
	uint32_t rope_init_lineno = -1;
	zend_op *opline = NULL;

	string_placeholder_count = 0;
	p = Z_STRVAL_P(format_string);
	end = p + Z_STRLEN_P(format_string);
	char *offset = p;
	for (;;) {
		p = memchr(p, '%', end - p);
		if (!p) {
			break;
		}

		char *q = p + 1;
		ZEND_ASSERT(q < end);
		ZEND_ASSERT(*q == 's' || *q == '%');

		if (*q == '%') {
			/* Optimization to not create a dedicated rope element for the literal '%':
			 * Include the first '%' within the "constant" part instead of dropping the
			 * full placeholder.
			 */
			p++;
		}

		if (p != offset) {
			znode const_node;
			const_node.op_type = IS_CONST;
			ZVAL_STRINGL(&const_node.u.constant, offset, p - offset);
			if (rope_elements == 0) {
				rope_init_lineno = get_next_op_number();
			}
			opline = zend_compile_rope_add(result, rope_elements++, &const_node);
Member

Nit: For the case of %%, merging constant strings within zend_compile_rope_add() would actually be beneficial. Otherwise we're essentially compiling sprintf('%%foo') to '%' . 'foo'. Although the optimizer probably takes care of this.

Member Author

I expect %% to be fairly rare within format strings. The previous remark about “clarity” also applies here: Merging of consecutive constants should ideally happen in a generic fashion, instead of requiring every user to reimplement it.

Member

Right. zend_compile_encaps_list() essentially does the same thing, which is what I was referring to. Hence, moving it to zend_compile_rope_add() would make it usable for both. That said, I wasn't expecting you to do that, but just noted it as a possible follow-up, and/or a potential // TODO to be added.
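
The follow-up idea discussed in this thread, merging consecutive constant parts in one shared place, could look roughly like the standalone sketch below. It is not php-src code: `struct rope_sketch`, `rope_add_const()` and `rope_add_value()` are hypothetical names, and the fixed-size buffers are an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SEGMENTS 16
#define MAX_CONST_LEN 64

/* One rope element: either a constant string or a reference to a value. */
struct segment {
	bool is_const;
	char text[MAX_CONST_LEN]; /* constants only, NUL-terminated */
	size_t len;
	int value_index;          /* values only, -1 for constants */
};

struct rope_sketch {
	struct segment segments[MAX_SEGMENTS];
	size_t count;
};

/* Adds a constant part; if the previous element is also a constant,
 * the new text is appended to it instead of creating a new element. */
static bool rope_add_const(struct rope_sketch *rope, const char *text, size_t len)
{
	assert(rope != NULL && text != NULL);

	if (rope->count > 0 && rope->segments[rope->count - 1].is_const) {
		struct segment *last = &rope->segments[rope->count - 1];
		if (last->len + len >= MAX_CONST_LEN) {
			return false;
		}
		memcpy(last->text + last->len, text, len);
		last->len += len;
		last->text[last->len] = '\0';
		return true;
	}

	if (rope->count == MAX_SEGMENTS || len >= MAX_CONST_LEN) {
		return false;
	}
	struct segment *seg = &rope->segments[rope->count++];
	seg->is_const = true;
	memcpy(seg->text, text, len);
	seg->len = len;
	seg->text[len] = '\0';
	seg->value_index = -1;
	return true;
}

/* Adds a non-constant element that refers to the value at `value_index`. */
static bool rope_add_value(struct rope_sketch *rope, int value_index)
{
	assert(rope != NULL);

	if (rope->count == MAX_SEGMENTS) {
		return false;
	}
	struct segment *seg = &rope->segments[rope->count++];
	seg->is_const = false;
	seg->text[0] = '\0';
	seg->len = 0;
	seg->value_index = value_index;
	return true;
}
```

With this shape, adding `"%"` and then `"foo"` leaves a single constant element `"%foo"`, so every caller of the add function would benefit without reimplementing the merge.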

		}

		if (*q == 's') {
			/* Perform the cast of constant arrays when actually evaluating corresponding placeholder
			 * for correct error reporting.
			 */
			if (elements[string_placeholder_count].op_type == IS_CONST) {
				if (Z_TYPE(elements[string_placeholder_count].u.constant) == IS_ARRAY) {
					zend_emit_op_tmp(&elements[string_placeholder_count], ZEND_CAST, &elements[string_placeholder_count], NULL)->extended_value = IS_STRING;
				}
			}
			if (rope_elements == 0) {
				rope_init_lineno = get_next_op_number();
			}
			opline = zend_compile_rope_add(result, rope_elements++, &elements[string_placeholder_count]);

			string_placeholder_count++;
		}

		p = q;
		p++;
		offset = p;
	}
	if (end != offset) {
		/* Add the constant part after the last placeholder. */
		znode const_node;
		const_node.op_type = IS_CONST;
		ZVAL_STRINGL(&const_node.u.constant, offset, end - offset);
		if (rope_elements == 0) {
			rope_init_lineno = get_next_op_number();
		}
		opline = zend_compile_rope_add(result, rope_elements++, &const_node);
	}
	if (rope_elements == 0) {
		/* Handle empty format strings. */
		znode const_node;
		const_node.op_type = IS_CONST;
		ZVAL_EMPTY_STRING(&const_node.u.constant);
		if (rope_elements == 0) {
			rope_init_lineno = get_next_op_number();
		}
		opline = zend_compile_rope_add(result, rope_elements++, &const_node);
	}
	ZEND_ASSERT(opline != NULL);

	zend_op *init_opline = CG(active_op_array)->opcodes + rope_init_lineno;
	zend_compile_rope_finalize(result, rope_elements, init_opline, opline);
	efree(elements);

	return SUCCESS;
}

static zend_result zend_try_compile_special_func_ex(znode *result, zend_string *lcname, zend_ast_list *args, zend_function *fbc, uint32_t type) /* {{{ */
{
	if (zend_string_equals_literal(lcname, "strlen")) {
@@ -4778,6 +4943,8 @@ static zend_result zend_try_compile_special_func_ex(znode *result, zend_string *
		return zend_compile_func_array_slice(result, args);
	} else if (zend_string_equals_literal(lcname, "array_key_exists")) {
		return zend_compile_func_array_key_exists(result, args);
	} else if (zend_string_equals_literal(lcname, "sprintf")) {
		return zend_compile_func_sprintf(result, args);
	} else {
		return FAILURE;
	}
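
The second pass of `zend_compile_func_sprintf()` splits the format string into constant parts and `%s` placeholders. As a rough illustration only, the same walk can be written as standalone C that concatenates plain C strings directly instead of emitting rope opcodes. The names `append()` and `concat_like_rope()` are hypothetical, and returning `false` stands in for the cases where the real code bails out and leaves the call to the regular `sprintf()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Appends `len` bytes of `src` to `out`, keeping `out` NUL-terminated. */
static bool append(char *out, size_t out_size, size_t *used, const char *src, size_t len)
{
	if (*used + len + 1 > out_size) {
		return false;
	}
	memcpy(out + *used, src, len);
	*used += len;
	out[*used] = '\0';
	return true;
}

/* Hypothetical sketch: walks `fmt` like the second pass of the patch and
 * joins the constant parts with `values`. Supports only `%s` and `%%`. */
static bool concat_like_rope(const char *fmt, const char *const *values,
                             size_t num_values, char *out, size_t out_size)
{
	assert(fmt != NULL && out != NULL && out_size > 0);

	const char *p = fmt;
	const char *end = fmt + strlen(fmt);
	const char *offset = p; /* start of the pending constant part */
	size_t used = 0;
	size_t next_value = 0;

	out[0] = '\0';
	while ((p = memchr(p, '%', (size_t)(end - p))) != NULL) {
		const char *q = p + 1;
		if (q == end || (*q != 's' && *q != '%')) {
			return false; /* unsupported directive */
		}
		if (*q == '%') {
			p++; /* keep the first '%' as part of the constant text */
		}
		if (p != offset && !append(out, out_size, &used, offset, (size_t)(p - offset))) {
			return false;
		}
		if (*q == 's') {
			if (next_value == num_values) {
				return false; /* more placeholders than values */
			}
			const char *value = values[next_value++];
			if (!append(out, out_size, &used, value, strlen(value))) {
				return false;
			}
		}
		p = q + 1;
		offset = p;
	}
	/* Constant part after the last placeholder, if any. */
	if (end != offset && !append(out, out_size, &used, offset, (size_t)(end - offset))) {
		return false;
	}
	return next_value == num_values;
}
```

For `"%%s"` the walk keeps the first `%` as constant text, skips the second one and appends the trailing `s`, which matches the `string(2) "%s"` expectation in the test file below.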
167 changes: 167 additions & 0 deletions ext/standard/tests/strings/sprintf_rope_optimization_001.phpt
@@ -0,0 +1,167 @@
--TEST--
Test sprintf() function : Rope Optimization
--FILE--
<?php
function func($str) {
	return strtoupper($str);
}
function sideeffect() {
	echo "Called!\n";
	return "foo";
}
class Foo {
	public function __construct() {
		echo "Called\n";
	}
}

$a = "foo";
$b = "bar";
$c = new stdClass();

try {
	var_dump(sprintf("const"));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s", $a));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s/%s", $a, $b));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s/%s/%s", $a, $b));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s/%s/%s", $a, $b, $c));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s/", func("baz")));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("/%s", func("baz")));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("/%s/", func("baz")));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s%s%s%s", $a, $b, func("baz"), $a));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s/%s", sprintf("%s:%s", $a, $b), sprintf("%s-%s", func('baz'), func('baz'))));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf(sideeffect()));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s-%s-%s", __FILE__, __LINE__, 1));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	$values = range('a', 'z');
	var_dump(sprintf("%s%s%s", "{$values[0]}{$values[1]}{$values[2]}", "{$values[3]}{$values[4]}{$values[5]}", "{$values[6]}{$values[7]}{$values[8]}"));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf("%s%s%s", new Foo(), new Foo(), new Foo(), ));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf(...));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf('%%s'));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf('%%s', 'test'));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf('%s-%s-%s', [], [], []));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

try {
	var_dump(sprintf(""));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

echo "Done";
?>
--EXPECTF--
string(5) "const"

string(3) "foo"

string(7) "foo/bar"

ArgumentCountError: 4 arguments are required, 3 given in %s:32
Stack trace:
#0 %s(32): sprintf('%s/%s/%s', 'foo', 'bar')
#1 {main}

Error: Object of class stdClass could not be converted to string in %s:36
Stack trace:
#0 {main}

string(4) "BAZ/"

string(4) "/BAZ"

string(5) "/BAZ/"

string(12) "foobarBAZfoo"

string(15) "foo:bar/BAZ-BAZ"

Called!
string(3) "foo"

string(%d) "%ssprintf_rope_optimization_001.php-%d-1"

string(9) "abcdefghi"

Called
Called
Called
Error: Object of class Foo could not be converted to string in %s:73
Stack trace:
#0 {main}

object(Closure)#3 (2) {
  ["function"]=>
  string(7) "sprintf"
  ["parameter"]=>
  array(2) {
    ["$format"]=>
    string(10) "<required>"
    ["$values"]=>
    string(10) "<optional>"
  }
}

string(2) "%s"

string(2) "%s"


Warning: Array to string conversion in %s on line 89

Warning: Array to string conversion in %s on line 89

Warning: Array to string conversion in %s on line 89
string(17) "Array-Array-Array"

string(0) ""

Done
26 changes: 26 additions & 0 deletions ext/standard/tests/strings/sprintf_rope_optimization_002.phpt
@@ -0,0 +1,26 @@
--TEST--
Test sprintf() function : Rope Optimization with a throwing error handler.
--FILE--
<?php

function exception_error_handler(int $errno, string $errstr, ?string $errfile, int $errline) {
	if (!(error_reporting() & $errno)) {
		// This error code is not included in error_reporting
		return;
	}
	throw new \ErrorException($errstr, 0, $errno, $errfile, $errline);
}
set_error_handler(exception_error_handler(...));

try {
	var_dump(sprintf("%s-%s", new stdClass(), []));
} catch (\Throwable $e) {echo $e, PHP_EOL; } echo PHP_EOL;

echo "Done";
?>
--EXPECTF--
Error: Object of class stdClass could not be converted to string in %s:13
Stack trace:
#0 {main}

Done