Remove extra lookups and memory allocations from tsort graph construction #8694

Nekrolm · 2025-09-21T16:10:54Z

Hi.
While benchmarking several tools, I found a couple of minor, low-hanging optimizations for tsort input processing:

Don't look up in the hashmap twice
Don't allocate extra arrays of input strings
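
A minimal sketch of the first point, assuming the graph is a `HashMap` keyed by node name. The `Node` struct, its fields, and `add_edge` are hypothetical names for illustration, not the actual tsort code. The entry API finds or creates a node with a single hash lookup, instead of checking for the key and then fetching it again.

```rust
use std::collections::HashMap;

/// Hypothetical node record; field names are illustrative only.
#[derive(Default, Debug)]
struct Node<'a> {
    successors: Vec<&'a str>,
    predecessor_count: usize,
}

/// Insert the edge `from -> to`.
/// `entry(..).or_default()` hashes each key once, whether or not the
/// node already exists, so no separate `contains_key`/`get_mut` is needed.
fn add_edge<'a>(graph: &mut HashMap<&'a str, Node<'a>>, from: &'a str, to: &'a str) {
    graph.entry(from).or_default().successors.push(to);
    graph.entry(to).or_default().predecessor_count += 1;
}

fn main() {
    let mut graph = HashMap::new();
    add_edge(&mut graph, "a", "b");
    add_edge(&mut graph, "b", "c");
    println!("{:?}", graph["b"]);
}
```

Borrowing the keys from the input buffer (`&'a str`) also avoids allocating a `String` per node, which is in the spirit of the second point.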

sylvestre · 2025-09-21T16:14:48Z

Did you look at the actual perf wins?
(with hyperfine: previous Rust impl, GNU, and this one)

github-actions · 2025-09-21T16:35:00Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

sylvestre · 2025-09-21T16:39:39Z

A job is failing; you need to run
cd fuzz && cargo build
to update Cargo.lock (yes, it is silly)

Nekrolm · 2025-09-21T17:41:39Z

It optimizes only the initial input processing, so any gain is visible only on large files, where the initial read is at all noticeable.

With a linear chain of 1 million edges (edges shuffled), before the change:

Benchmark 1: target/release/coreutils tsort line_shuf.txt || true
  Time (mean ± σ):     19.849 s ±  1.410 s    [User: 17.447 s, System: 2.365 s]
  Range (min … max):   17.963 s … 21.943 s    10 runs

After

Benchmark 1: target/release/coreutils tsort line_shuf.txt || true
  Time (mean ± σ):     19.577 s ±  0.652 s    [User: 17.293 s, System: 2.262 s]
  Range (min … max):   18.920 s … 20.719 s    10 runs

Actually, itertools is not needed, and it's even better without it: its internal chunks buffer relies heavily on RefCell runtime checks.
I've updated the PR to drop it:

No itertools

Benchmark 1: target/release/coreutils tsort line_shuf.txt || true
  Time (mean ± σ):     18.866 s ±  0.453 s    [User: 16.725 s, System: 2.137 s]
  Range (min … max):   18.289 s … 19.867 s    10 runs

tsort (GNU coreutils) 9.4 (haven't checked the latest one)

dmis@dmis-asus-N7600PC:~/WORKSPACE/coreutils$ hyperfine "tsort line_shuf.txt || true"
Benchmark 1: tsort line_shuf.txt || true
  Time (mean ± σ):     28.474 s ±  4.092 s    [User: 27.882 s, System: 0.498 s]
  Range (min … max):   25.950 s … 39.779 s    10 runs
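
The line_shuf.txt input itself is not included in the thread. The following is a sketch of how a comparable file could be generated, assuming the chain is made of integer-named nodes 0 -> 1 -> ... -> n with one "from to" pair per line. The node naming, the seed, and the small xorshift generator are all assumptions chosen to keep the example free of external crates.

```rust
use std::io::{self, BufWriter, Write};

/// Tiny xorshift64 generator so the sketch needs no external crates.
/// Only intended for shuffling test data.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Build the edges 0->1, 1->2, ..., (n-1)->n and shuffle them
/// with a Fisher-Yates pass.
fn shuffled_chain_edges(n: u64, seed: u64) -> Vec<(u64, u64)> {
    let mut edges: Vec<(u64, u64)> = (0..n).map(|i| (i, i + 1)).collect();
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    for i in (1..edges.len()).rev() {
        let j = (rng.next() % (i as u64 + 1)) as usize;
        edges.swap(i, j);
    }
    edges
}

fn main() -> io::Result<()> {
    let mut out = BufWriter::new(io::stdout().lock());
    for (from, to) in shuffled_chain_edges(1_000_000, 42) {
        writeln!(out, "{from} {to}")?;
    }
    Ok(())
}
```

Redirecting the program's standard output to a file gives an input that can be passed to tsort in place of line_shuf.txt.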

sylvestre · 2025-09-21T17:44:09Z

Please run hyperfine this way:
hyperfine "tsort line_shuf.txt || true" " target/release/coreutils tsort line_shuf.txt || true"
It works better this way to compare implementations.

codspeed-hq · 2025-09-21T17:54:47Z

CodSpeed Performance Report

Merging #8694 will not alter performance

Comparing Nekrolm:main (4f09383) with main (7dbeb8f)

Summary

✅ 55 untouched
⏩ 1 skipped¹

¹ 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, it can be archived to remove it from the performance reports.

Nekrolm · 2025-09-21T18:19:40Z

dmis@dmis-asus-N7600PC:~/WORKSPACE/coreutils$  hyperfine "tsort line_shuf.txt || true" " target/release/coreutils tsort line_shuf.txt || true"
Benchmark 1: tsort line_shuf.txt || true
  Time (mean ± σ):     33.722 s ±  3.816 s    [User: 33.083 s, System: 0.620 s]
  Range (min … max):   27.813 s … 40.324 s    10 runs
 
Benchmark 2:  target/release/coreutils tsort line_shuf.txt || true
  Time (mean ± σ):     21.824 s ±  2.108 s    [User: 19.456 s, System: 2.356 s]
  Range (min … max):   18.467 s … 25.517 s    10 runs
 
Summary
   target/release/coreutils tsort line_shuf.txt || true ran
    1.55 ± 0.23 times faster than tsort line_shuf.txt || true
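
The "1.55 ± 0.23" figure can be reproduced from the two means and standard deviations above. This sketch assumes the summary is the ratio of the means with first-order error propagation. That assumption is inferred from the numbers, not taken from hyperfine's source, and the function name `speedup` is hypothetical.

```rust
/// Ratio of two means with first-order (Gaussian) error propagation:
/// relative errors of the two measurements are added in quadrature.
fn speedup(slow_mean: f64, slow_sd: f64, fast_mean: f64, fast_sd: f64) -> (f64, f64) {
    let ratio = slow_mean / fast_mean;
    let rel = ((slow_sd / slow_mean).powi(2) + (fast_sd / fast_mean).powi(2)).sqrt();
    (ratio, ratio * rel)
}

fn main() {
    // GNU tsort: 33.722 s ± 3.816 s; uutils tsort: 21.824 s ± 2.108 s
    let (ratio, err) = speedup(33.722, 3.816, 21.824, 2.108);
    println!("{ratio:.2} ± {err:.2}"); // prints "1.55 ± 0.23"
}
```

The uncertainty is large relative to the few-percent difference between the earlier before/after runs, so those smaller gains are best read as indicative.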

dmis@dmis-asus-N7600PC:~/WORKSPACE/coreutils$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 140
model name      : 11th Gen Intel(R) Core(TM) i7-11370H @ 3.30GHz
stepping        : 1
microcode       : 0xbc
cpu MHz         : 843.122
cache size      : 12288 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 27
wp              : yes

sylvestre · 2025-09-21T18:22:31Z

well done!

github-actions · 2025-09-21T18:28:12Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

sylvestre · 2025-09-22T06:48:54Z

src/uu/tsort/src/tsort.rs

-            _ => return Err(TsortError::NumTokensOdd(input.to_string_lossy().to_string()).into()),
-        }
+
+    let mut edge_tokens = data.split_whitespace();


maybe document this a bit more to explain what it is doing ;)

Expanded a bit
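
For readers of the thread, a minimal sketch of what consuming `data.split_whitespace()` pairwise can look like, without itertools and without collecting the tokens into an intermediate array. The `for_each_edge` helper and the `OddTokenCount` error are hypothetical. The real code reports `TsortError::NumTokensOdd`, and the rest of the merged implementation is not shown in this thread.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct OddTokenCount;

/// Walk whitespace-separated tokens two at a time and hand each
/// (from, to) pair to `add_edge`. Tokens borrow from `data`, so no
/// per-token String or intermediate Vec is allocated.
fn for_each_edge<'a>(
    data: &'a str,
    mut add_edge: impl FnMut(&'a str, &'a str),
) -> Result<(), OddTokenCount> {
    let mut edge_tokens = data.split_whitespace();
    // Each iteration consumes two tokens: the edge source, then its target.
    while let Some(from) = edge_tokens.next() {
        // A source without a target means the input had an odd token count.
        let to = edge_tokens.next().ok_or(OddTokenCount)?;
        add_edge(from, to);
    }
    Ok(())
}

fn main() {
    let mut edges = Vec::new();
    for_each_edge("a b\nb c", |from, to| edges.push((from, to))).unwrap();
    println!("{edges:?}"); // prints [("a", "b"), ("b", "c")]
}
```

Note that in this sketch the pairs seen before a dangling final token have already been passed to the callback by the time the error is returned.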

github-actions · 2025-09-22T07:49:57Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

github-actions · 2025-09-22T22:09:52Z

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T14:17:10Z

GNU testsuite comparison:

GNU test failed: tests/tail/overlay-headers. tests/tail/overlay-headers is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-09-24T19:41:37Z

GNU testsuite comparison:

GNU test failed: tests/tail/overlay-headers. tests/tail/overlay-headers is passing on 'main'. Maybe you have to rebase?

github-actions · 2025-09-24T21:05:22Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Nekrolm force-pushed the main branch from 14d452f to 9dc387c Compare September 21, 2025 17:40

Nekrolm force-pushed the main branch from 29ce45b to e35431a Compare September 21, 2025 17:51

Nekrolm force-pushed the main branch from e35431a to 54851fe Compare September 21, 2025 18:02

sylvestre reviewed Sep 22, 2025

Nekrolm force-pushed the main branch from 54851fe to bccffee Compare September 22, 2025 07:27

sylvestre force-pushed the main branch from bccffee to 0962a1a Compare September 22, 2025 21:45

Nekrolm force-pushed the main branch from e3faa2a to ac7b145 Compare September 24, 2025 18:51

Nekrolm mentioned this pull request Sep 24, 2025

tsort: use iterative dfs to prevent stack overflows #8737

Open

sylvestre force-pushed the main branch from ac7b145 to 4f09383 Compare September 24, 2025 20:42

Nekrolm added 2 commits September 28, 2025 21:35

Remove extra lookups and memory allocations from tsort graph construction (c862fc0)

Remove extra lookups and memory allocations from tsort graph construction (9db21a9)

sylvestre force-pushed the main branch from 4f09383 to 9db21a9 Compare September 28, 2025 19:35

sylvestre merged commit ec7e88f into uutils:main Sep 28, 2025
