ext/mbstring: stack overflow in mb_guess_encoding called via mb_detect_encoding #21223

@jordikroon

Description

The following code:

<?php
$str = "hello";

$list = [];
for ($i = 0; $i < 500000; $i++) {
    $list[] = "UTF-8";
}

var_dump(mb_detect_encoding($str, $list, false));

Resulted in this output:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==60542==ERROR: AddressSanitizer: stack-overflow on address 0x00016a9db900 (pc 0x00010d2f06e4 bp 0x00016bceea30 sp 0x00016a9db940 T0)
    #0 0x00010d2f06e4 in __asan_alloca_poison+0x4 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x106e4)
    #1 0x0001054bac50 in mb_guess_encoding mbstring.c:3453
    #2 0x0001054ccbd8 in zif_mb_detect_encoding mbstring.c:3514
    #3 0x000108a609e8 in ZEND_DO_ICALL_SPEC_RETVAL_USED_TAILCALL_HANDLER zend_vm_execute.h:54104
    #4 0x000108024db4 in execute_ex zend_vm_execute.h:110065
    #5 0x0001080262f8 in zend_execute zend_vm_execute.h:115483
    #6 0x0001091e25b8 in zend_execute_script zend.c:1980
    #7 0x00010752ae9c in php_execute_script_ex main.c:2648
    #8 0x00010752b640 in php_execute_script main.c:2688
    #9 0x0001091f07e8 in do_cli php_cli.c:949
    #10 0x0001091ed6e4 in main php_cli.c:1360
    #11 0x00019b305d50 in start+0x1c0c (dyld:arm64e+0x3d50)

SUMMARY: AddressSanitizer: stack-overflow mbstring.c:3453 in mb_guess_encoding
==60542==ABORTING

But I expected this output instead:

string(5) "UTF-8"

PHP Version

PHP 8.5.1 (cli) (built: Dec 16 2025 15:59:07) (ZTS)
Copyright (c) The PHP Group
Built by Shivam Mathur
Zend Engine v4.5.1, Copyright (c) Zend Technologies
    with Zend OPcache v8.5.1, Copyright (c), by Zend Technologies

Operating System

No response
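
Possible workaround

Frame #0 in the trace is __asan_alloca_poison, called from mb_guess_encoding. That suggests the candidate array is allocated on the stack in proportion to the length of the encoding list, though this is an inference from the trace and not something confirmed against the source here. If so, keeping the list short should avoid the crash until a fix lands. A minimal sketch, assuming duplicates are the reason the list is large; dedupe_encodings is a hypothetical userland helper, not part of mbstring:

```php
<?php
// Hypothetical helper (not part of mbstring): collapse duplicate encoding
// names so the candidate list passed to mb_detect_encoding() stays small.
// Names are compared case-insensitively; the first spelling seen is kept.
function dedupe_encodings(array $encodings): array
{
    $seen = [];
    foreach ($encodings as $name) {
        $key = strtolower(trim($name));
        if ($key !== '' && !isset($seen[$key])) {
            $seen[$key] = $name;
        }
    }
    return array_values($seen);
}

// Same input as the reproducer: 500000 copies of "UTF-8".
$list = array_fill(0, 500000, "UTF-8");
$candidates = dedupe_encodings($list);

var_dump(count($candidates));                               // int(1)
var_dump(mb_detect_encoding("hello", $candidates, false));  // string(5) "UTF-8"
```

This only collapses identical names. Aliases of the same encoding are kept as separate entries, and a list of many distinct encodings is not shortened, so it sidesteps the reproducer above but is not a fix for the underlying issue.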
