Iterative loop for ls recursive directory search #8728

kimono-koans · 2025-09-24T00:45:33Z

Would resolve: #8725

codspeed-hq · 2025-09-24T01:04:04Z

CodSpeed Performance Report

Merging #8728 will degrade performances by 5.24%

Comparing kimono-koans:recursive_loop (3e4b7eb) with main (32eef06)

Summary

❌ 10 regressions
✅ 83 untouched
⏩ 1 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Status	Benchmark	`BASE`	`HEAD`	Change
❌	`ls_recursive_balanced_tree[(3, 4, 8)]`	436 µs	452.7 µs	-3.7%
❌	`ls_recursive_balanced_tree[(4, 3, 6)]`	428.7 µs	447.4 µs	-4.19%
❌	`ls_recursive_balanced_tree[(5, 2, 10)]`	435.1 µs	454.9 µs	-4.35%
❌	`ls_recursive_long_all_balanced_tree[(4, 3, 6)]`	514.5 µs	532.1 µs	-3.3%
❌	`ls_recursive_long_all_balanced_tree[(5, 2, 10)]`	525 µs	542.4 µs	-3.2%
❌	`ls_recursive_long_all_wide_tree[(1000, 200)]`	6.8 ms	7 ms	-3.53%
❌	`ls_recursive_long_all_wide_tree[(5000, 500)]`	32.5 ms	33.6 ms	-3.29%
❌	`ls_recursive_mixed_tree`	515.8 µs	527.9 µs	-2.3%
❌	`ls_recursive_wide_tree[(1000, 200)]`	5.3 ms	5.5 ms	-2.98%
❌	`ls_recursive_wide_tree[(10000, 1000)]`	48.4 ms	51.1 ms	-5.24%

¹ 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, archive it on CodSpeed to remove it from the performance reports.

github-actions · 2025-09-24T01:06:33Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T01:38:17Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T02:30:49Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T03:09:17Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T04:12:14Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T05:09:42Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T08:06:04Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T16:34:16Z

GNU testsuite comparison:

GNU test failed: tests/tail/overlay-headers. tests/tail/overlay-headers is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-24T18:14:14Z

GNU testsuite comparison:

GNU test failed: tests/tail/overlay-headers. tests/tail/overlay-headers is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-25T21:03:13Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-25T22:11:44Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

sylvestre · 2025-09-26T06:44:41Z

According to https://codspeed.io/uutils/coreutils/branches/kimono-koans%3Arecursive_loop
it regressed a bunch of benchmarks, by up to 6%.

wdyt?

github-actions · 2025-09-26T07:06:59Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-26T09:08:31Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

sylvestre · 2025-09-26T09:13:15Z

I reran the benchmark:
#8728 (comment)

kimono-koans · 2025-09-26T17:00:45Z

In reply to sylvestre: "according to https://codspeed.io/uutils/coreutils/branches/kimono-koans%3Arecursive_loop it regressed a bunch of benchmarks. Up to 6%: wdyt ?"

I'd say performance isn't the primary driver of this 1st/2nd/3rd step? Of course recursion can be faster than an iterative approach, especially in trivial/small cases (which, I'll admit, are mostly the scenario for ls). Recursion, after all, uses stack space instead of the heap! The problem is usually that the recursive method also never deallocates and can cause either a stack overflow or runaway memory usage. It is also naturally single-threaded.
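To make the trade-off concrete, here is a minimal sketch of an iterative directory walk that keeps its pending work in a heap-allocated `Vec` instead of on the call stack. It is not the code from this PR: `walk_dirs_iterative` is a hypothetical name, it uses only `std::fs`, and it ignores everything `ls` actually has to do (sorting options, symlink handling, output).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns every directory reachable from `root` (including `root`),
/// in depth-first pre-order, without any recursive function calls.
/// Hypothetical helper for illustration only.
fn walk_dirs_iterative(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut visited = Vec::new();
    // The explicit stack lives on the heap, so tree depth is limited
    // by available memory rather than by the thread's stack size.
    let mut stack = vec![root.to_path_buf()];

    while let Some(dir) = stack.pop() {
        let mut children = Vec::new();
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not follow symlinks, so
            // symlinked directories are not descended into here.
            if entry.file_type()?.is_dir() {
                children.push(entry.path());
            }
        }
        // Sort, then push in reverse so the smallest path is popped first.
        children.sort();
        stack.extend(children.into_iter().rev());
        visited.push(dir);
    }
    Ok(visited)
}
```

Each popped directory is fully read and its entries are released before the next one is opened, which is the deallocation behaviour the recursive form does not give you for free.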

And, as discussed, this is a step:

"This is also the first step to concurrent directory search."

In other PRs, I've removed mutable state and removed the DirEntry (a >1K bloat) from the PathData struct, so we can now clone PathData and move it onto a new thread (!), which makes concurrent search at least possible. Even a rayon::join() is nearly possible now (to see why not yet, see the DirEntry in the PathData struct): we would simply spawn a rayon thread to get the next read_dir ready while enter_directory runs.
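The prefetch idea can be sketched with the standard library alone. The example below swaps in `std::thread::scope` for `rayon::join` so that it has no external dependencies, and it works on plain `String` names rather than PathData. `read_names` and `list_with_prefetch` are hypothetical names, and nothing here reflects how `enter_directory` is actually written.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Reads one directory and returns its entry names, sorted.
fn read_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}

/// Lists each directory in order. While the current listing is being
/// formatted, a scoped thread is already reading the next directory.
fn list_with_prefetch(dirs: &[PathBuf]) -> io::Result<Vec<String>> {
    let mut out = Vec::new();
    let Some(first) = dirs.first() else {
        return Ok(out);
    };
    let mut current = read_names(first)?;

    for (i, dir) in dirs.iter().enumerate() {
        let next_dir = dirs.get(i + 1);
        let (lines, next) = thread::scope(|s| {
            // Start reading the next directory, if there is one.
            let prefetch = next_dir.map(|d| s.spawn(move || read_names(d)));
            // Meanwhile, do the "display" work for the current one.
            let lines: Vec<String> = current
                .iter()
                .map(|name| dir.join(name).display().to_string())
                .collect();
            let next = prefetch.map(|h| h.join().expect("prefetch thread panicked"));
            (lines, next)
        });
        out.extend(lines);
        match next {
            Some(result) => current = result?,
            None => break,
        }
    }
    Ok(out)
}
```

This only works because everything crossing the thread boundary is owned and `Send`, which is the property the comment above says PathData needs before the same trick can be applied in `ls`. Whether the overlap pays for the thread spawn is a separate question that would need benchmarking.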

Despite recursion being perhaps the best for the common case, we have to prepare for the worst case. It's why we use a HashMap instead of a BTreeMap, which would undoubtedly be the faster method in the common case. We have to anticipate Firefox cache directories with 700K items, etc.

These are all iterative steps (har har) which may provide more flexibility to refactor into something better in the future. For instance, of the two versions below, you have to imagine the 2nd is more cache-friendly than the 1st, simply because the loops are tighter, but we need to get rid of &mut state.out everywhere to make this happen.

for loc in locs {
    let path_data = PathData::new(PathBuf::from(loc), None, None, config, true);

    // Getting metadata here is no big deal as it's just the CWD
    // and we really just want to know if the strings exist as files/dirs
    //
    // Proper GNU handling is don't show if dereferenced symlink DNE
    // but only for the base dir, for a child dir show, and print ?s
    // in long format
    if path_data.get_metadata(&mut state.out).is_none() {
        continue;
    }

    let show_dir_contents = match path_data.file_type(&mut state.out) {
        Some(ft) => !config.directory && ft.is_dir(),
        None => {
            set_exit_code(1);
            false
        }
    };

    if show_dir_contents {
        dirs.push(path_data);
    } else {
        files.push(path_data);
    }
}

let (mut dirs, mut files): (Vec<PathData>, Vec<PathData>) = locs
    .into_iter()
    .map(|loc| PathData::new(PathBuf::from(loc), None, None, config, true))
    .filter(|path_data| {
        // Getting metadata here is no big deal as it's just the CWD
        // and we really just want to know if the strings exist as files/dirs
        //
        // Proper GNU handling is don't show if dereferenced symlink DNE
        // but only for the base dir, for a child dir show, and print ?s
        // in long format
        //
        // Keep only entries that have metadata; this mirrors the
        // `continue` on `is_none()` in the loop version above.
        path_data.get_metadata().is_some()
    })
    .partition(|path_data| match path_data.file_type() {
        // true => goes into `dirs`, false => goes into `files`
        Some(ft) => !config.directory && ft.is_dir(),
        None => {
            set_exit_code(1);
            false
        }
    });
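For readers who want to check that the two shapes really are equivalent, here is a self-contained sketch with the `ls`-specific types stripped out. `split_loop` and `split_iter` are hypothetical names, and plain `std::fs::metadata` stands in for `PathData::get_metadata` and `PathData::file_type`; it is an illustration of the refactor, not the PR's code.

```rust
use std::fs;
use std::path::PathBuf;

/// Loop form: skip paths that cannot be stat'ed, then sort the rest
/// into directories and non-directories.
fn split_loop(locs: &[&str]) -> (Vec<PathBuf>, Vec<PathBuf>) {
    let mut dirs = Vec::new();
    let mut files = Vec::new();
    for loc in locs {
        let path = PathBuf::from(*loc);
        let Ok(meta) = fs::metadata(&path) else {
            continue;
        };
        if meta.is_dir() {
            dirs.push(path);
        } else {
            files.push(path);
        }
    }
    (dirs, files)
}

/// Iterator form: the `continue` becomes a `filter` that KEEPS the
/// paths with metadata, and the if/else becomes `partition`.
fn split_iter(locs: &[&str]) -> (Vec<PathBuf>, Vec<PathBuf>) {
    locs.iter()
        .map(|loc| PathBuf::from(*loc))
        .filter(|path| fs::metadata(path).is_ok())
        // Note: `is_dir` stats the path a second time in this sketch.
        .partition(|path| path.is_dir())
}
```

The point to watch when converting is the direction of the predicate: a loop that skips on "no metadata" becomes a filter that keeps on "has metadata". The sketch also shows why no shared `&mut` output handle can be threaded through the closures without extra work.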

I'd also be very pleased to combine all my PRs into one mega update that shows measurable performance gains, if you would accept that? Or any other approach the maintainers would prefer.

But some things just aren't going to last if this program ever wants to be faster, like mutable state everywhere, certain fat structs, etc. Someone is going to have to refactor them out sometime to get anywhere with ls.

github-actions · 2025-09-28T07:11:30Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-28T07:41:08Z

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

kimono-koans added 3 commits September 23, 2025 19:26

Initial commit

753f8b0

Closer

3d3d9de

Get queue order correct

8393d5f

Use one contains, instead of insert and remove

b8bee00

kimono-koans added 2 commits September 23, 2025 21:02

No need for HashSet

fe259c4

Cleanup

7c2c27b

Merge branch 'main' into recursive_loop

fa68904

Reduce syscalls via retained metadata

da4e85d

kimono-koans added 4 commits September 23, 2025 23:29

Fix lints

aba3997

Fix lints

09cbb15

Cleanup

c1304de

Cleanup

a90d73c

Merge branch 'main' into recursive_loop

d096b8b

RenjiSann reviewed Sep 24, 2025

src/uu/ls/src/ls.rs (outdated, resolved)

kimono-koans added 2 commits September 24, 2025 11:12

Use "stack" instead of queue

11f1739

Merge branch 'recursive_loop' of https://github.com/kimono-koans/core…

05990a3

…utils into recursive_loop

kimono-koans added 4 commits September 24, 2025 12:19

More likely inode is different than device, allows early bailout

08e2e27

Use single mut vector instead of appending new vectors to stack

451b4a5

Fix bug re: recursive symlinks

fc70e39

Cleanup

78a80c2

kimono-koans added 6 commits September 25, 2025 22:41

GNU implementation does not error out upon a already listed directory

a9c22c3

Revert "GNU implementation does not error out upon a already listed d…

3e7e35f

…irectory" This reverts commit c6f2a7b.

Cleanup

08b45c7

Cleanup

0191fc3

Refactor

68443ea

Fix lints

d3ba40d

sylvestre force-pushed the recursive_loop branch from 42a3af8 to d3ba40d Compare September 25, 2025 20:41

Merge branch 'recursive_loop' of https://github.com/kimono-koans/core…

0c51069

…utils into recursive_loop

sylvestre closed this Sep 26, 2025

sylvestre reopened this Sep 26, 2025

Merge branch 'main' into recursive_loop

a55bf78

kimono-koans added 7 commits September 28, 2025 01:18

Reduce stat calls for additional info in modes which don't need such …

1dc2ad9

…info, clone struct to avoid additional syscalls

Fix lints

36fa836

Reduce stat calls for additional info in modes which don't need such …

e89886e

…info, clone struct to avoid additional syscalls Fix lints

Revert "Reduce stat calls for additional info in modes which don't ne…

d0c7b82

…ed such info, clone struct to avoid additional syscalls" This reverts commit e89886e.

Merge branch 'recursive_loop' of https://github.com/kimono-koans/core…

de1fd8d

…utils into recursive_loop

Revert "Merge branch 'recursive_loop' of https://github.com/kimono-ko…

f01b14b

…ans/coreutils into recursive_loop" This reverts commit de1fd8d, reversing changes made to d0c7b82.

Merge branch 'main' into recursive_loop

23a5e28

kimono-koans added 2 commits September 28, 2025 02:18

Avoid additional info generation

04e0266

Merge branch 'recursive_loop' of https://github.com/kimono-koans/core…

3e4b7eb

…utils into recursive_loop

kimono-koans mentioned this pull request Sep 28, 2025

ls: uutils locales support causing excess of statx and readlink syscalls #8761

Open
