Add grapheme_strlen() example for invalid UTF-8 input #5556

@masakielastic

Affected page

https://www.php.net/manual/en/function.grapheme-strlen.php

Current issue

The grapheme_strlen() manual page currently documents the function signature as:

grapheme_strlen(string $string): int|false|null

However, the examples only show successful usage; none of them shows how to handle a null or false return value.

This may be confusing because grapheme_strlen() expects a valid UTF-8 string. If the input contains invalid UTF-8 bytes, the function can return null.

For example:

<?php

$string = "\xFF";

var_dump(grapheme_strlen($string));

?>

Output:

NULL

Users who are not familiar with invalid byte sequences may be confused by this behavior. They may expect a string length to always be returned when passing a PHP string.
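To make the contrast concrete, here is a minimal sketch comparing strlen(), which counts bytes and always returns an integer, with grapheme_strlen(), which counts grapheme clusters and needs valid UTF-8. The variable names are only illustrative, and the NULL result for the invalid input is the behavior reported above rather than something guaranteed for every PHP and ICU version.

```php
<?php

// "e" followed by U+0301 COMBINING ACUTE ACCENT: 3 bytes, 1 grapheme cluster.
$valid = "e\u{301}";

// A lone 0xFF byte can never appear in well-formed UTF-8.
$invalid = "\xFF";

var_dump(strlen($valid));            // int(3): byte count
var_dump(grapheme_strlen($valid));   // int(1): grapheme cluster count

var_dump(strlen($invalid));          // int(1): byte count, no validation
var_dump(grapheme_strlen($invalid)); // NULL, as reported in this issue

?>
```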

Suggested improvement

Add an example showing how to handle a failure result and inspect the intl error code.

For example:

<?php

$string = "\xFF";

$length = grapheme_strlen($string);

if (!is_int($length)) {
    $code = intl_get_error_code();

    printf(
        "grapheme_strlen() failed: %s (%d)\n",
        intl_error_name($code),
        $code
    );
} else {
    echo $length, PHP_EOL;
}

?>

Expected output:

grapheme_strlen() failed: U_INVALID_CHAR_FOUND (10)

A short explanation could be added before the example:

grapheme_strlen() expects a valid UTF-8 string. If the function does not return an integer, the intl error functions can be used to inspect the ICU error code.

Additional context

The null return value can be demonstrated with invalid UTF-8 input, for example grapheme_strlen("\xFF").

The false return path appears to be much harder to reproduce from ordinary userland input, because it seems to correspond to an internal failure during grapheme boundary processing, such as ICU break iterator initialization. Therefore, the proposed example uses !is_int($length) to handle both null and false, while using invalid UTF-8 as the reproducible failure case.
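If the manual wants to show the two failure values separately, a wrapper along these lines might work. This is only a sketch: grapheme_length_or_fail() is a hypothetical helper name, not part of intl, and it is an assumption that intl_get_error_code() carries a meaningful code on the false path (it may simply report U_ZERO_ERROR there).

```php
<?php

// Hypothetical helper (not part of ext/intl): returns the grapheme count
// or throws, so callers never have to deal with null or false.
function grapheme_length_or_fail(string $string): int
{
    $length = grapheme_strlen($string);

    if ($length === null || $length === false) {
        // Read the error code right away; a later successful intl call
        // resets it. Assumption: the code may be U_ZERO_ERROR when the
        // function returned false.
        $code = intl_get_error_code();

        throw new RuntimeException(
            sprintf(
                'grapheme_strlen() returned %s: %s',
                $length === null ? 'null' : 'false',
                intl_error_name($code)
            ),
            $code
        );
    }

    return $length;
}

echo grapheme_length_or_fail("abc"), PHP_EOL; // 3

try {
    grapheme_length_or_fail("\xFF");
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}

?>
```

The strict comparisons keep a valid length of 0 (the empty string) from being mistaken for a failure, which a loose check such as !$length would do.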

Labels: enhancement
