19 changes: 19 additions & 0 deletions ROADMAP.md
@@ -513,6 +513,25 @@ gaps that must be fixed first.
- [x] Phase 7 — `RegexIterator`, `RecursiveRegexIterator`
- [x] Phase 8 — file/directory iterators: `SplFileInfo`, `SplFileObject`, `SplTempFileObject`, `DirectoryIterator`, `FilesystemIterator`, `GlobIterator`, `RecursiveDirectoryIterator`, `RecursiveCachingIterator`

### Array builtin parity (key/list helpers, associative set-ops, recursive merge/walk)

Well-bounded PHP-visible array builtins added before the backend migration. All dual-target (ARM64 + x86_64), with codegen and error tests, runtime-GC verified.

- [x] `array_key_first()` / `array_key_last()` (PHP 7.3) — first/last key in insertion order, boxed as `Mixed`, `null` for empty arrays
- [x] `array_is_list()` (PHP 8.1) — sequential `0..n-1` key check (indexed arrays are lists by construction; associative arrays walk the insertion-order chain)
- [x] `array_replace()` — right-wins key merge over associative arrays (clone + `__rt_hash_set` overwrite)
- [x] `array_replace_recursive()` — recursive right-wins merge, recursing when both values at a key are associative arrays
- [x] `array_diff_assoc()` / `array_intersect_assoc()` — key + string-cast-value comparison via a unified `__rt_assoc_diff_intersect` helper
- [x] `array_merge_recursive()` — integer-key renumbering, string-key collisions recurse (both arrays) or combine into a list (scalars)
- [x] `array_walk_recursive()` — invokes the callback on each non-array leaf, recursing through nested indexed/associative arrays
- [x] `array_find()` / `array_any()` / `array_all()` (PHP 8.4) — predicate callbacks; find returns the first match or `null`, any/all return booleans
- [x] `array_udiff()` / `array_uintersect()` — difference/intersection with a user comparator (`$cmp($a, $b) === 0`)
- [x] `array_multisort()` — sort the first indexed array ascending (stable) and reorder a second array in tandem, both by reference (two scalar-element arrays; flags/descending/multi-key are follow-ups)
- [x] Scalar indexed-array inputs for the hash-based functions (`array_replace`, `array_replace_recursive`, `array_diff_assoc`, `array_intersect_assoc`, `array_merge_recursive`) — converted to integer-keyed hashes via `__rt_array_to_hash`; result key/value widen to `Mixed` for heterogeneous (indexed + string-keyed) inputs so `foreach` dispatches keys correctly (string/heap element indexed inputs are a follow-up — they hit x86-specific converter/clone-shallow issues)
- [x] Hash-based builtins persist string values (instead of incref-sharing) when building results, so results own their string payloads independently of the source/temporary inputs (fixes a latent use-after-free when an input is freed before the result)
- [x] `array_merge_recursive()` string-scalar combine fix — combined-list string values are persisted (independent copies) instead of incref-shared, so the temporary wrappers can be released without corrupting the result
- Hash-based functions accept associative arrays and scalar-element indexed arrays; string/heap-element indexed inputs and the callback/sort functions' element-type limits are documented in `docs/php/arrays.md`
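The "clone + overwrite" behavior listed for `array_replace()` can be sketched as a small model. This is only an illustration under stated assumptions: `IntMap` and `replace_right_wins` are hypothetical names, a `Vec` of pairs stands in for the runtime's insertion-ordered hash, and nothing here is the actual `__rt_hash_set` code path.

```rust
/// Hypothetical stand-in for an insertion-ordered PHP hash (string keys, integer values).
type IntMap = Vec<(String, i64)>;

/// Models a right-wins merge: clone the base, then overwrite or append each replacement.
fn replace_right_wins(base: &IntMap, replacements: &IntMap) -> IntMap {
    let mut result = base.clone(); // the source array is left untouched
    for (key, value) in replacements {
        match result.iter_mut().find(|entry| entry.0 == *key) {
            // Existing key: overwrite the value and keep the original position.
            Some(entry) => entry.1 = *value,
            // New key: append at the end of the insertion order.
            None => result.push((key.clone(), *value)),
        }
    }
    result
}
```

In this model, patching `host, port, debug` with `port, tls` yields `host, port, debug, tls`: the overwritten key stays where it was and the new key is appended.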

## v0.24.x — EIR introduction and register allocation

Introduce a domain-specific intermediate representation (EIR) between the
15 changes: 15 additions & 0 deletions docs/php/arrays.md
@@ -152,11 +152,17 @@ PHP does not allow keyed and unkeyed entries in the same destructuring pattern,
| `array_keys()` | `array_keys($arr): array` | Returns the array keys |
| `array_values()` | `array_values($arr): array` | Returns copy of values |
| `array_key_exists()` | `array_key_exists($key, $arr): bool` | Check if key exists |
| `array_key_first()` | `array_key_first($arr): int\|string\|null` | First key in insertion order, or `null` if the array is empty |
| `array_key_last()` | `array_key_last($arr): int\|string\|null` | Last key in insertion order, or `null` if the array is empty |
| `array_is_list()` | `array_is_list($arr): bool` | `true` if the keys are exactly `0..count-1` in order (the empty array is a list) |
| `array_search()` | `array_search($needle, $arr): int\|string\|false` | Search for value, returning an integer index for indexed arrays, the first matching associative-array key, or `false` if not found |
| `array_slice()` | `array_slice($arr, $offset [, $length]): array` | Extract a slice |
| `array_splice()` | `array_splice($arr, $offset [, $length]): array` | Remove a slice in place and return the removed elements |
| `array_chunk()` | `array_chunk($arr, $size): array` | Split into chunks |
| `array_merge()` | `array_merge($arr1, $arr2): array` | Merge two arrays |
| `array_merge_recursive()` | `array_merge_recursive($arr1, $arr2): array` | Recursively merge two arrays: integer keys append (renumbered), string keys that collide recurse when both values are arrays and otherwise combine into a list. Accepts associative arrays or **indexed arrays of scalars** (int/float/bool); nested indexed-array values are treated as opaque. |
| `array_replace()` | `array_replace($arr, $replacements): array` | Returns a copy of `$arr` in which matching keys are overwritten (keeping their original position) and new keys from `$replacements` are appended; later values win. Accepts associative arrays or **indexed arrays of scalars** (int/float/bool). |
| `array_replace_recursive()` | `array_replace_recursive($arr, $replacements): array` | Like `array_replace()`, but when both values at a key are associative arrays they are merged recursively instead of overwritten. Accepts associative arrays or **indexed arrays of scalars** (int/float/bool); nested indexed arrays are overwritten, not merged. |
| `array_combine()` | `array_combine($keys, $values): array` | Create array from keys/values |
| `array_fill()` | `array_fill($start, $num, $value): array` | Fill with values |
| `array_fill_keys()` | `array_fill_keys($keys, $value): array` | Fill with values using keys |
@@ -166,6 +172,10 @@ PHP does not allow keyed and unkeyed entries in the same destructuring pattern,
| `array_intersect()` | `array_intersect($arr1, $arr2): array` | Values in both |
| `array_diff_key()` | `array_diff_key($arr1, $arr2): array` | Keys in $arr1 not in $arr2 |
| `array_intersect_key()` | `array_intersect_key($arr1, $arr2): array` | Keys in both |
| `array_udiff()` | `array_udiff($arr1, $arr2, $cmp): array` | Values in $arr1 not in $arr2, equality decided by the two-argument comparator (`$cmp($a, $b) === 0`). Supports string / function / non-capturing closure comparators. |
| `array_uintersect()` | `array_uintersect($arr1, $arr2, $cmp): array` | Values in both arrays, equality decided by the comparator (`$cmp($a, $b) === 0`). |
| `array_diff_assoc()` | `array_diff_assoc($arr1, $arr2): array` | Entries of $arr1 whose `(key, value)` pair is absent from $arr2 (values compared as `(string)$a === (string)$b`). Accepts associative arrays or **indexed arrays of scalars** (int/float/bool). |
| `array_intersect_assoc()` | `array_intersect_assoc($arr1, $arr2): array` | Entries of $arr1 whose `(key, value)` pair is present in $arr2 (values compared as strings). Accepts associative arrays or **indexed arrays of scalars** (int/float/bool). |
| `array_unique()` | `array_unique($arr): array` | Remove duplicates |
| `array_reverse()` | `array_reverse($arr): array` | Reverse order |
| `array_flip()` | `array_flip($arr): array` | Exchange keys and values, normalizing integer and numeric-string result keys |
@@ -185,6 +195,11 @@ PHP does not allow keyed and unkeyed entries in the same destructuring pattern,
| `shuffle()` | `shuffle($arr): void` | Randomly shuffle (in-place) |
| `array_rand()` | `array_rand($arr): int` | Pick one random key |
| `array_map()` | `array_map($callback, $arr): array` | Apply callback to each element |
| `array_walk_recursive()` | `array_walk_recursive($arr, $callback): void` | Apply `$callback` to each non-array leaf value, recursing into nested indexed/associative arrays. Leaf values must share a scalar type (consistent with `array_walk`: leaf passed by value, no key argument). |
| `array_multisort()` | `array_multisort($arr1, $arr2): bool` | Sort `$arr1` ascending (stable) and reorder `$arr2` in tandem; both are sorted in place (by reference). **Two indexed arrays of scalar elements**; sort flags, descending order, and >2 arrays are follow-ups. |
| `array_find()` | `array_find($arr, $callback): mixed` | (PHP 8.4) Returns the first element for which `$callback($value)` is truthy, or `null` if none match. |
| `array_any()` | `array_any($arr, $callback): bool` | (PHP 8.4) `true` if `$callback($value)` is truthy for at least one element. |
| `array_all()` | `array_all($arr, $callback): bool` | (PHP 8.4) `true` if `$callback($value)` is truthy for every element. |
| `array_filter()` | `array_filter($arr, $callback): array` | Filter where callback is truthy |
| `array_reduce()` | `array_reduce($arr, $callback, $init): int` | Reduce to single value |
| `array_walk()` | `array_walk($arr, $callback): void` | Call callback on each element |
38 changes: 38 additions & 0 deletions examples/assoc-arrays/main.php
@@ -106,3 +106,41 @@
echo "\n";
}
echo "As JSON: " . json_encode($profile) . "\n";

// First and last keys, and list-shape detection
$ranking = ["gold" => 1, "silver" => 2, "bronze" => 3];
echo "\nFirst key: " . array_key_first($ranking) . "\n";
echo "Last key: " . array_key_last($ranking) . "\n";
echo "Ranking is a list? " . (array_is_list($ranking) ? "yes" : "no") . "\n";
echo "[10,20,30] is a list? " . (array_is_list([10, 20, 30]) ? "yes" : "no") . "\n";

// array_replace: later values win, keys keep their position
$config = ["host" => "localhost", "port" => 8080, "debug" => 0];
$patched = array_replace($config, ["port" => 9090, "debug" => 1]);
echo "\nPatched config:\n";
foreach ($patched as $key => $value) {
    echo " " . $key . " = " . $value . "\n";
}

// array_merge_recursive: nested arrays merge instead of being overwritten
$a = ["limits" => ["cpu" => 1], "tags" => ["a" => 1]];
$b = ["limits" => ["mem" => 2], "tags" => ["b" => 2]];
$merged = array_merge_recursive($a, $b);
echo "\nRecursively merged limits:\n";
foreach ($merged["limits"] as $key => $value) {
    echo " " . $key . " = " . $value . "\n";
}

// array_diff_assoc / array_intersect_assoc compare both key and value
$left = ["a" => 1, "b" => 2, "c" => 3];
$right = ["a" => 1, "b" => 9];
echo "\nDiff (kept from left): " . count(array_diff_assoc($left, $right)) . " entries\n";
echo "Intersect (in both): " . count(array_intersect_assoc($left, $right)) . " entries\n";

// The hash-based functions also accept plain indexed arrays of scalars: the
// indexed input is treated as an integer-keyed map (key 0, 1, 2, ...).
$levels = array_replace([10, 20, 30], [1 => 99]);
echo "\nPatched levels:\n";
foreach ($levels as $index => $level) {
    echo " [" . $index . "] = " . $level . "\n";
}
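The key-plus-value comparison used by `array_diff_assoc()` and `array_intersect_assoc()` can be modeled with one helper and a mode flag, mirroring the idea of a unified diff/intersect routine. This is a sketch under assumptions, not the runtime helper itself: `Scalar`, `ScalarMap`, and `assoc_diff_intersect` are hypothetical names, and values are compared after a string cast as the docs describe.

```rust
/// Hypothetical scalar value, standing in for the runtime's boxed values.
#[derive(Clone, Debug, PartialEq)]
enum Scalar {
    Int(i64),
    Str(String),
}

impl Scalar {
    /// Models the `(string)$value` cast used for comparison.
    fn cast_to_string(&self) -> String {
        match self {
            Scalar::Int(n) => n.to_string(),
            Scalar::Str(s) => s.clone(),
        }
    }
}

/// Hypothetical stand-in for an insertion-ordered PHP hash.
type ScalarMap = Vec<(String, Scalar)>;

/// Keeps entries of `left` whose (key, value) pair is present in `right`
/// (`keep_matches == true`, intersect) or absent from it (`false`, diff).
fn assoc_diff_intersect(left: &ScalarMap, right: &ScalarMap, keep_matches: bool) -> ScalarMap {
    left.iter()
        .filter(|(key, value)| {
            let matched = right
                .iter()
                .any(|(k, v)| k == key && v.cast_to_string() == value.cast_to_string());
            matched == keep_matches
        })
        .cloned()
        .collect()
}
```

With the `$left` and `$right` arrays from the example above, this model keeps two entries (`b`, `c`) for the diff and one entry (`a`) for the intersection.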
118 changes: 118 additions & 0 deletions src/codegen/builtins/arrays/array_find_any_all.rs
@@ -0,0 +1,118 @@
//! Purpose:
//! Emits PHP `array_find`, `array_any`, and `array_all` (PHP 8.4) predicate builtins.
//! Resolves the predicate callback and dispatches to the unified `__rt_array_find_any_all` helper.
//!
//! Called from:
//! - `crate::codegen::builtins::arrays::emit()`.
//!
//! Key details:
//! - Reuses the array-callback machinery for string and closure/function callbacks; a mode selects find/any/all.

use crate::codegen::abi;
use crate::codegen::context::Context;
use crate::codegen::data_section::DataSection;
use crate::codegen::emit::Emitter;
use crate::codegen::expr::emit_expr;
use crate::parser::ast::Expr;
use crate::types::PhpType;
use super::callback_env;
use super::runtime_string_callback;

/// Emits the PHP 8.4 `array_find` / `array_any` / `array_all` predicate builtins.
///
/// Evaluates the array (first arg), then resolves the predicate callback. The unified
/// runtime helper `__rt_array_find_any_all` receives `(callback, array, env, mode)` where
/// mode is `0` (find — returns the first matching element or `null`), `1` (any — boolean),
/// or `2` (all — boolean).
///
/// Supports string callbacks (`"is_positive"`) and closures / plain function callbacks,
/// covering the dominant predicate usage. Operates on indexed arrays with scalar elements
/// (consistent with `array_filter`).
///
/// # Returns
/// `Some(PhpType::Mixed)` for `array_find` (element or null), `Some(PhpType::Bool)` for
/// `array_any` / `array_all`.
pub fn emit(
    name: &str,
    args: &[Expr],
    emitter: &mut Emitter,
    ctx: &mut Context,
    data: &mut DataSection,
) -> Option<PhpType> {
    let mode: i64 = match name {
        "array_any" => 1,
        "array_all" => 2,
        _ => 0,
    };
    let ret_ty = if name == "array_find" {
        PhpType::Mixed
    } else {
        PhpType::Bool
    };
    emitter.comment(&format!("{}()", name));

    let call_reg = abi::nested_call_reg(emitter);
    let result_reg = abi::int_result_reg(emitter);
    let cb_arg = abi::int_arg_reg_name(emitter.target, 0);
    let arr_arg = abi::int_arg_reg_name(emitter.target, 1);
    let env_arg = abi::int_arg_reg_name(emitter.target, 2);
    let mode_arg = abi::int_arg_reg_name(emitter.target, 3);

    // -- evaluate the array argument, then the callback (PHP source order) --
    let arr_ty = emit_expr(&args[0], emitter, ctx, data);
    let elem_ty = match &arr_ty {
        PhpType::Array(elem) => elem.codegen_repr(),
        _ => PhpType::Int,
    };
    abi::emit_push_reg(emitter, result_reg); // push the source array pointer onto the temporary stack

    // -- string callback path ("is_positive") --
    if runtime_string_callback::emit_after_saved_array(
        &args[1],
        Some(&arr_ty),
        vec![elem_ty.clone()],
        PhpType::Bool,
        arr_arg,
        emitter,
        ctx,
        data,
        |wrapper, emitter, _ctx, _data| {
            callback_env::load_env_slot_to_reg(emitter, arr_arg, wrapper.array_slot_offset);
            abi::emit_symbol_address(emitter, cb_arg, &wrapper.wrapper_label);
            callback_env::load_env_pointer_to_reg(emitter, env_arg);
            abi::emit_load_int_immediate(emitter, mode_arg, mode);
            abi::emit_call_label(emitter, "__rt_array_find_any_all");
        },
    ) {
        return Some(ret_ty);
    }

    // -- closure / plain function callback path --
    let captures =
        callback_env::materialize_callback_address(&args[1], call_reg, emitter, ctx, data);
    if captures.is_empty() {
        abi::emit_pop_reg(emitter, arr_arg); // pop the source array pointer into the array argument register
        emitter.instruction(&format!("mov {}, {}", cb_arg, call_reg)); // move the callback function address into the callback argument register
        abi::emit_load_int_immediate(emitter, env_arg, 0);
        abi::emit_load_int_immediate(emitter, mode_arg, mode);
        abi::emit_call_label(emitter, "__rt_array_find_any_all");
    } else {
        abi::emit_pop_reg(emitter, result_reg); // recover the source array pointer before building the capture environment
        let wrapper = callback_env::emit_captured_callback_env(
            call_reg,
            result_reg,
            &captures,
            vec![elem_ty],
            emitter,
            ctx,
        );
        callback_env::load_env_slot_to_reg(emitter, arr_arg, wrapper.array_slot_offset);
        abi::emit_symbol_address(emitter, cb_arg, &wrapper.wrapper_label);
        callback_env::load_env_pointer_to_reg(emitter, env_arg);
        abi::emit_load_int_immediate(emitter, mode_arg, mode);
        abi::emit_call_label(emitter, "__rt_array_find_any_all");
        abi::emit_release_temporary_stack(emitter, wrapper.env_bytes);
    }

    Some(ret_ty)
}
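The mode-based dispatch described above (`0` = find, `1` = any, `2` = all) can be illustrated with a plain-Rust model of what a unified helper computes. This is a sketch only: `Outcome` and `find_any_all` are hypothetical names, the elements are assumed to be integers, and the real `__rt_array_find_any_all` is a runtime routine whose implementation is not shown in this diff.

```rust
/// Hypothetical result type: `Found` models the Mixed element-or-null result of
/// `array_find`, `Flag` models the boolean result of `array_any` / `array_all`.
#[derive(Debug, PartialEq)]
enum Outcome {
    Found(Option<i64>),
    Flag(bool),
}

/// Models one helper serving all three builtins, selected by `mode`.
fn find_any_all(items: &[i64], predicate: impl Fn(i64) -> bool, mode: u8) -> Outcome {
    match mode {
        // find: first element the predicate accepts, or None (PHP `null`)
        0 => Outcome::Found(items.iter().copied().find(|&v| predicate(v))),
        // any: true as soon as one element is accepted
        1 => Outcome::Flag(items.iter().copied().any(|v| predicate(v))),
        // all: true only if every element is accepted (vacuously true when empty)
        2 => Outcome::Flag(items.iter().copied().all(|v| predicate(v))),
        // Assumption for this sketch: any other mode is a caller bug.
        _ => panic!("unknown mode {mode}"),
    }
}
```

Sharing one helper keeps the three builtins on a single iteration loop and callback-invocation path; only the early-exit condition and the result shape differ per mode.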
63 changes: 63 additions & 0 deletions src/codegen/builtins/arrays/array_is_list.rs
@@ -0,0 +1,63 @@
//! Purpose:
//! Emits PHP `array_is_list` builtin calls.
//! Returns a boolean indicating whether an array has sequential 0..n-1 integer keys.
//!
//! Called from:
//! - `crate::codegen::builtins::arrays::emit()`.
//!
//! Key details:
//! - Indexed arrays are lists by construction; associative arrays and Mixed values defer to the runtime walk.

use crate::codegen::abi;
use crate::codegen::context::Context;
use crate::codegen::data_section::DataSection;
use crate::codegen::emit::Emitter;
use crate::codegen::expr::emit_expr;
use crate::codegen::platform::Arch;
use crate::parser::ast::Expr;
use crate::types::PhpType;

/// Emits code for the PHP `array_is_list` builtin.
///
/// `array_is_list($array)` returns `true` when the array's keys are exactly the
/// integers `0..count-1` in order (the empty array is a list), and `false`
/// otherwise.
///
/// # Codegen
/// - Evaluates `args[0]` into the container register.
/// - For a statically indexed `PhpType::Array`, loads the constant `1`: indexed
/// arrays always have sequential keys, so the runtime walk is skipped.
/// - For associative arrays and `Mixed` values, calls `__rt_array_is_list`, which
/// reads the heap kind, walks the hash insertion-order chain, and unwraps boxed
/// array payloads.
///
/// # Returns
/// `Some(PhpType::Bool)` — the list-shape predicate result in the integer result register.
pub fn emit(
    _name: &str,
    args: &[Expr],
    emitter: &mut Emitter,
    ctx: &mut Context,
    data: &mut DataSection,
) -> Option<PhpType> {
    emitter.comment("array_is_list()");
    let arr_ty = emit_expr(&args[0], emitter, ctx, data);

    if matches!(arr_ty, PhpType::Array(_)) {
        if emitter.target.arch == Arch::X86_64 {
            emitter.instruction("mov eax, 1"); // indexed arrays always have sequential 0..n-1 keys
        } else {
            emitter.instruction("mov x0, #1"); // indexed arrays always have sequential 0..n-1 keys
        }
        return Some(PhpType::Bool);
    }

    if emitter.target.arch == Arch::X86_64 {
        emitter.instruction("mov rdi, rax"); // move the container pointer into the first x86_64 argument register
        abi::emit_call_label(emitter, "__rt_array_is_list");
        return Some(PhpType::Bool);
    }

    emitter.instruction("bl __rt_array_is_list"); // walk the hash insertion order to test list shape
    Some(PhpType::Bool)
}
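The list-shape test that the runtime walk performs for associative arrays can be modeled as a check that the i-th key in insertion order is the integer `i`. This is a minimal sketch under assumptions: `Key` and `is_list` are hypothetical names, and a slice of keys stands in for the hash insertion-order chain that `__rt_array_is_list` is described as walking.

```rust
/// Hypothetical key type for an insertion-ordered PHP hash.
enum Key {
    Int(i64),
    Str(String),
}

/// Returns true when the keys, in insertion order, are exactly 0, 1, ..., n-1.
/// The empty key sequence is a list.
fn is_list(keys_in_insertion_order: &[Key]) -> bool {
    keys_in_insertion_order
        .iter()
        .enumerate()
        .all(|(position, key)| matches!(key, Key::Int(k) if *k == position as i64))
}
```

Note that order matters in this model: keys `1, 0` cover the right range but fail the check, which matches the "in order" wording of the doc comment above.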