Aikido

Why you should wrap array_filter() with array_values() in PHP

Bug Risk

Rule
Wrap array filtering results with array_values().
Functions like array_filter() preserve original keys, which can cause bugs when code expects sequential numeric indexes starting from 0.

Supported languages: PHP

Introduction

PHP's array_filter() preserves the original array keys when filtering elements, even for numerically indexed arrays. After filtering [0 => 'a', 1 => 'b', 2 => 'c'] to remove index 1, you get [0 => 'a', 2 => 'c'] with a gap in the numeric keys. Code that expects sequential indexes starting from 0 breaks when accessing $array[1] or iterating with assumptions about continuous indexes. This behavior surprises developers coming from other languages where filtering returns reindexed arrays.
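The gap described above can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative, and the arrow-function syntax assumes PHP 7.4 or later.

```php
<?php
// A plain list with keys 0, 1, 2.
$letters = ['a', 'b', 'c'];

// Remove 'b'. array_filter() keeps the surviving elements under their old keys.
$filtered = array_filter($letters, fn($letter) => $letter !== 'b');

var_dump(array_keys($filtered));   // the keys are 0 and 2, with no key 1
var_dump(isset($filtered[1]));     // bool(false)

// array_values() rebuilds the list with keys 0 and 1.
$reindexed = array_values($filtered);
var_dump($reindexed[1]);           // string(1) "c"
```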

Why it matters

Code maintainability: Non-sequential array keys create subtle bugs that only appear under specific conditions. Code that works fine with unfiltered arrays mysteriously fails after filtering when it assumes count($array) - 1 is the highest valid index. Debugging these issues wastes time because the root cause isn't obvious: you see an array with three elements but can't access the second one with index 1.

JSON encoding issues: When you json_encode() an array with non-sequential keys, PHP treats it as an object instead of an array, producing {"0":"a","2":"c"} instead of ["a","c"]. Frontend code expecting JSON arrays receives objects instead, breaking iteration and array methods. This mismatch between PHP arrays and JavaScript arrays causes integration bugs that only appear after filtering operations.
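The difference in JSON output is easy to check directly. A short sketch, using the same three-letter list as an assumed input:

```php
<?php
// Filtering leaves keys 0 and 2, so the array is no longer a sequential list.
$filtered = array_filter(['a', 'b', 'c'], fn($value) => $value !== 'b');

$asObject = json_encode($filtered);               // {"0":"a","2":"c"}
$asArray  = json_encode(array_values($filtered)); // ["a","c"]

echo $asObject, PHP_EOL;
echo $asArray, PHP_EOL;
```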

Iteration and pagination errors: Code that walks an array by computed index, such as manual pagination with offsets, breaks when keys aren't sequential. Looping with a counter from 0 to count($array) - 1 reads undefined indexes and skips real elements. Note that array_slice() and array_chunk() reset integer keys by default, which hides the gaps, but passing true for their preserve_keys argument brings the original keys back. These bugs compound in complex data processing pipelines.
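The counter-loop failure can be sketched as follows. The input numbers are made up for illustration; the point is that the loop bound comes from count() while the values sit under the old keys.

```php
<?php
// Keep multiples of 10. The result holds 10 under key 0 and 20 under key 2.
$items = array_filter([10, 15, 20, 25], fn($n) => $n % 10 === 0);

// A counter loop visits indexes 0 and 1, so it misses the value under key 2.
$seen = [];
for ($i = 0; $i < count($items); $i++) {
    $seen[] = $items[$i] ?? null; // index 1 does not exist
}

// Reindexing first makes the same loop safe.
$list = array_values($items);
$seenAfterReindex = [];
for ($i = 0; $i < count($list); $i++) {
    $seenAfterReindex[] = $list[$i];
}

var_dump($seen);             // contains 10 and null; 20 is never reached
var_dump($seenAfterReindex); // contains 10 and 20
```

Iterating with foreach avoids the problem as well, since it never computes an index.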

Code examples

❌ Non-compliant:

function getActiveUsers(array $users): string {
    $activeUsers = array_filter($users, function($user) {
        return $user['status'] === 'active';
    });

    // Bug: assumes index 0 exists and keys are sequential
    $firstActive = $activeUsers[0] ?? null;

    // Bug: JSON encodes as object if keys aren't sequential
    return json_encode($activeUsers);
}

$users = [
    ['id' => 1, 'status' => 'inactive'],
    ['id' => 2, 'status' => 'active'],
    ['id' => 3, 'status' => 'active']
];

// Returns {"1":{"id":2...},"2":{"id":3...}} (object, not array)
getActiveUsers($users);

Why it's wrong: After filtering removes index 0, the array has keys [1, 2] instead of [0, 1], so $activeUsers[0] does not exist and $firstActive silently becomes null. JSON encoding produces an object instead of an array because the keys aren't sequential, breaking frontend code that expects arrays.

✅ Compliant:

function getActiveUsers(array $users): string {
    $activeUsers = array_filter($users, function($user) {
        return $user['status'] === 'active';
    });

    // Reindex to sequential keys starting from 0
    $activeUsers = array_values($activeUsers);

    // Now index 0 always exists for non-empty arrays
    $firstActive = $activeUsers[0] ?? null;

    // JSON encodes as proper array: [{"id":2...},{"id":3...}]
    return json_encode($activeUsers);
}

$users = [
    ['id' => 1, 'status' => 'inactive'],
    ['id' => 2, 'status' => 'active'],
    ['id' => 3, 'status' => 'active']
];

// Returns [{"id":2...},{"id":3...}] (array, as expected)
getActiveUsers($users);

Why this matters: array_values() reindexes the filtered array to sequential numeric keys starting from 0, making index access predictable and ensuring JSON encoding produces arrays instead of objects. The function behaves as expected regardless of which elements were filtered out.

Conclusion

Always wrap array_filter(), array_diff(), and similar functions with array_values() when you need sequential numeric indexes. This prevents subtle bugs from non-sequential keys and ensures JSON encoding produces arrays instead of objects. The performance cost is negligible compared to the debugging time saved.
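One way to make the pattern hard to forget is a small wrapper. The helper below, filterList(), is a hypothetical name used for illustration and is not part of PHP; it only combines the two built-in calls.

```php
<?php
/**
 * Hypothetical helper: filter a list and return it with keys 0..n-1.
 * Not intended for associative arrays, because string keys are discarded.
 */
function filterList(array $items, callable $predicate): array
{
    return array_values(array_filter($items, $predicate));
}

$evens = filterList([1, 2, 3, 4], fn($n) => $n % 2 === 0);
var_dump($evens); // keys 0 and 1 holding 2 and 4
```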

FAQs

Which PHP array functions preserve keys and need array_values()?

array_filter(), array_diff(), array_diff_key(), array_intersect(), and array_intersect_key() all preserve original keys. array_map() called with a single array also preserves keys, so it does not create gaps itself but carries any existing gaps through unchanged. Any function documented as preserving keys should be wrapped with array_values() if you need sequential indexes.
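A quick sketch of the key-preserving behavior for a few of these functions, with assumed sample values:

```php
<?php
$letters = ['x', 'y', 'z'];

// array_diff() drops 'x' but leaves 'y' and 'z' under keys 1 and 2.
$diff = array_diff($letters, ['x']);

// array_intersect() keeps 'z' under its original key 2.
$intersect = array_intersect($letters, ['z']);

// array_map() with one array passes the existing keys through unchanged.
$mapped = array_map('strtoupper', $diff);

var_dump(array_keys($diff));      // 1 and 2
var_dump(array_keys($intersect)); // 2
var_dump(array_keys($mapped));    // 1 and 2
```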

Does array_values() have performance implications?

Minimal. array_values() iterates through the array once to reindex it, which is O(n) but very fast in practice. This cost is negligible compared to the filtering operation itself and far less than debugging production issues from non-sequential keys. Always prioritize correctness over premature optimization.

What about associative arrays with string keys?

Don't use array_values() on associative arrays where keys are meaningful strings. It would replace your string keys with numeric indexes, losing important information. This rule applies only to numerically indexed arrays where sequential keys starting from 0 are expected.
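To illustrate, here is a sketch with an assumed price map where the string keys carry meaning:

```php
<?php
$prices = ['apple' => 1.2, 'kiwi' => 0.0, 'pear' => 0.8];

// Filtering keeps the string keys, which is exactly what we want here.
$priced = array_filter($prices, fn($price) => $price > 0);

// Reindexing would throw away the product names.
$namesLost = array_values($priced);

var_dump(array_keys($priced));    // 'apple' and 'pear'
var_dump(array_keys($namesLost)); // 0 and 1
```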

Can I use array_splice() or other functions instead?

array_splice() reindexes automatically, but it's meant for insertion/removal, not filtering. For filtering operations, array_filter() with array_values() is clearer and more idiomatic. Use the right tool for the job rather than abusing functions to get reindexing as a side effect.

How do I test for this issue in my codebase?

Look for array_filter() calls whose results are accessed by numeric index, encoded to JSON, or passed to functions expecting sequential keys. Test with data where filtered elements aren't at the end of the array. If removing the first or middle elements causes bugs, you need array_values().
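In tests, a small check can flag arrays whose keys are not 0..n-1. The function isSequentialList() below is a hypothetical helper written for this sketch; on PHP 8.1 and later the built-in array_is_list() serves the same purpose.

```php
<?php
/**
 * Hypothetical helper: true when the keys are exactly 0..n-1 in order.
 * Strict array comparison checks keys, values, order, and types.
 */
function isSequentialList(array $array): bool
{
    return array_values($array) === $array;
}

// Remove the first element so the remaining keys start at 1.
$filtered = array_filter(['a', 'b', 'c'], fn($value) => $value !== 'a');

var_dump(isSequentialList($filtered));               // bool(false)
var_dump(isSequentialList(array_values($filtered))); // bool(true)
```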
