
Conversation


@yongkangc yongkangc commented Oct 28, 2025

Closes #19249

Eliminates 2-5ms of sorting overhead per block by returning TrieInputSorted instead of unsorted TrieInput from compute_trie_input, and having ExecutedBlock use TrieInputSorted.

Previously, MultiProofConfig::from_input would call drain_into_sorted() on both nodes and state every block, performing expensive sorting operations. Now compute_trie_input sorts once at the end and returns sorted data, making MultiProofConfig::from_input a simple Arc wrapper.

Changes:
- Add TrieInputSorted type with sorted TrieUpdates and HashedPostState
- Add clear() methods to TrieUpdatesSorted and HashedPostStateSorted
- Update compute_trie_input to return (TrieInputSorted, BlockNumber)
- Update MultiProofConfig::from_input to accept TrieInputSorted
- Update BasicEngineValidator to store Option<TrieInputSorted>

The implementation uses a "build unsorted, sort once" strategy:
unsorted HashMap-based structures are used during building for fast
extend operations, then sorted once before returning. This eliminates
redundant sorting while maintaining performance.
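As an illustration of that strategy, here is a minimal sketch with simplified stand-in types (plain u64 pairs, not reth's actual TrieInput/TrieInputSorted, which carry trie nodes, hashed accounts, and storages): accumulate in a HashMap per block, then sort once at the boundary.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the unsorted and sorted trie-input shapes.
#[derive(Default)]
struct UnsortedInput {
    state: HashMap<u64, u64>, // fast `extend` while accumulating blocks
}

struct SortedInput {
    state: Vec<(u64, u64)>, // sorted once, read afterwards
}

impl UnsortedInput {
    fn extend(&mut self, block_state: impl IntoIterator<Item = (u64, u64)>) {
        self.state.extend(block_state);
    }

    // Sort exactly once, at the end of accumulation.
    fn into_sorted(self) -> SortedInput {
        let mut state: Vec<_> = self.state.into_iter().collect();
        state.sort_unstable_by_key(|(key, _)| *key);
        SortedInput { state }
    }
}

fn main() {
    let mut input = UnsortedInput::default();
    for block_state in [vec![(3, 30), (1, 10)], vec![(2, 20)]] {
        input.extend(block_state); // cheap HashMap extends per block
    }
    let sorted = input.into_sorted(); // single sort at the boundary
    assert_eq!(sorted.state, vec![(1, 10), (2, 20), (3, 30)]);
}
```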

Resolves: #19249
@yongkangc yongkangc added the C-perf (A change motivated by improving speed, memory usage or disk footprint) and A-engine (Related to the engine implementation) labels Oct 28, 2025
@yongkangc yongkangc requested a review from Rjected as a code owner October 28, 2025 04:15
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Oct 28, 2025
@yongkangc yongkangc marked this pull request as draft October 28, 2025 04:21
@yongkangc yongkangc self-assigned this Oct 28, 2025
@yongkangc yongkangc moved this from Backlog to In Progress in Reth Tracker Oct 28, 2025
Updated the handling of trie input in the compute_trie_input function to improve performance and memory efficiency. The changes include:

- Replaced the use of Option<TrieInputSorted> with Option<TrieInput> to allow for better reuse of allocated capacity.
- Introduced a new method, drain_into_sorted, in TrieInput to convert it into TrieInputSorted while retaining HashMap capacity for subsequent operations.
- Adjusted the logic in compute_trie_input to utilize the new method, reducing unnecessary allocations and improving performance during block validations.

These modifications streamline the trie input processing, enhancing overall efficiency in the engine's validation workflow.
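A minimal sketch of the capacity-retention idea behind a drain_into_sorted-style method, using a simplified stand-in type rather than the real TrieInput: std's HashMap::drain empties the map but keeps its allocated buckets, which is what makes reusing the cached input cheap on the next block.

```rust
use std::collections::HashMap;

// Simplified stand-in for the cached input.
#[derive(Default)]
struct CachedInput {
    state: HashMap<u64, u64>,
}

impl CachedInput {
    fn drain_into_sorted(&mut self) -> Vec<(u64, u64)> {
        // `HashMap::drain` empties the map but keeps its allocated capacity.
        let mut sorted: Vec<_> = self.state.drain().collect();
        sorted.sort_unstable_by_key(|(key, _)| *key);
        sorted
    }
}

fn main() {
    let mut cache = CachedInput::default();
    cache.state.extend([(2u64, 20u64), (1, 10)]);
    let capacity_before = cache.state.capacity();

    let sorted = cache.drain_into_sorted();
    assert_eq!(sorted, vec![(1, 10), (2, 20)]);

    // The map is empty, but its buckets remain allocated for reuse.
    assert!(cache.state.is_empty());
    assert!(cache.state.capacity() >= capacity_before);
}
```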
@yongkangc yongkangc added the A-trie (Related to Merkle Patricia Trie implementation) label Oct 29, 2025
Replaced the existing HashedPostState and TrieUpdates with their sorted counterparts, HashedPostStateSorted and TrieUpdatesSorted, in the ExecutedBlock struct. This makes state handling more efficient by keeping the trie updates and hashed state in sorted order, improving performance during block execution and validation.
- The previous approach:
  - Converts every sorted block back into hash maps (cloning all keys/values once per block) because extend_with_blocks works on the unsorted representation.
  - After all that, drain_into_sorted() iterates those hash maps, builds sorted Vecs, and drains the allocations: more cloning and shuffling before we return to the same sorted layout we could have maintained from the start.

- So the new loop cuts out the conversion overhead and reduces allocations; the old code was strictly more work for the same end result (see the sketch after this list).
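For illustration only (simplified u64 keys/values, not the actual extend logic in the PR), one way to fold an already-sorted block into a sorted accumulator without a HashMap round trip is a straight merge:

```rust
use std::cmp::Ordering;

// Merge one already-sorted block into a sorted accumulator; later blocks win
// on key collisions. The real code operates on account/storage entries.
fn merge_sorted(acc: &mut Vec<(u64, u64)>, block: &[(u64, u64)]) {
    let mut merged = Vec::with_capacity(acc.len() + block.len());
    let (mut i, mut j) = (0, 0);
    while i < acc.len() && j < block.len() {
        match acc[i].0.cmp(&block[j].0) {
            Ordering::Less => {
                merged.push(acc[i]);
                i += 1;
            }
            Ordering::Greater => {
                merged.push(block[j]);
                j += 1;
            }
            Ordering::Equal => {
                merged.push(block[j]); // newer block overrides the older value
                i += 1;
                j += 1;
            }
        }
    }
    merged.extend_from_slice(&acc[i..]);
    merged.extend_from_slice(&block[j..]);
    *acc = merged;
}

fn main() {
    let mut acc = vec![(1, 10), (3, 30)];
    merge_sorted(&mut acc, &[(2, 200), (3, 300)]);
    assert_eq!(acc, vec![(1, 10), (2, 200), (3, 300)]);
}
```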
Your changes improve performance by:

1.  **Avoiding Costly Allocations:** Instead of creating and merging many temporary `HashMap`s, the code now builds the final `HashMap` directly from sorted lists (`Vec`s). This drastically reduces memory allocation overhead.

2.  **Faster CPU Operations:** Iterating over a `Vec` is more cache-friendly and faster for the CPU than iterating over a `HashMap`. Additionally, you removed a redundant lookup (`.remove()`) from a critical loop, saving extra CPU cycles.
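A small sketch of both points on simplified data (not the real trie types): reserve once, insert straight from the sorted Vec, and let `insert` handle replacement without a prior lookup.

```rust
use std::collections::HashMap;

// One up-front reservation and a single pass over the sorted Vec; `insert`
// already replaces an existing entry, so no separate `remove` is needed.
fn build_from_sorted(sorted: &[(u64, u64)]) -> HashMap<u64, u64> {
    let mut map = HashMap::with_capacity(sorted.len());
    for &(key, value) in sorted {
        map.insert(key, value);
    }
    map
}

fn main() {
    let map = build_from_sorted(&[(1, 10), (2, 20), (2, 25)]);
    assert_eq!(map.get(&2), Some(&25)); // last write wins, no prior `remove`
    assert_eq!(map.len(), 2);
}
```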
- Since TrieInputSorted already has the Arc-wrapped nodes and state, we can use them directly without creating an intermediate wrapper struct. After eliminating both call sites, MultiProofConfig became dead code, so we removed it.
Fixed syntax error in test_hashed_storage_extend_from_sorted_wiped
that was preventing compilation and formatting.

@mattsse mattsse left a comment


all of this lgtm

although I haven't reviewed all the order+extend changes line by line

```rust
input.state = Arc::clone(&first.hashed_state);
input.nodes = Arc::clone(&first.trie_updates);

// Only clone and mutate if there are multiple in-memory blocks.
```
A collaborator replied:
ah this makes sense

Use `.peekable()` instead of checking `blocks.len() > 1` for more
idiomatic iterator usage. This makes the code clearer by checking
if there are more items after consuming the first one, rather than
mixing iterator consumption with slice length checks.
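Roughly, that pattern looks like the following sketch (plain integers stand in for blocks; the real code reuses the first block's Arc-wrapped state and only clones and mutates when more blocks follow):

```rust
fn main() {
    let blocks = vec![1, 2, 3];
    let mut iter = blocks.iter().peekable();

    if let Some(first) = iter.next() {
        let mut acc = *first; // cheap path: take the first block as-is

        // `peek` answers "are there more items?" without a separate
        // `blocks.len() > 1` check alongside iterator consumption.
        if iter.peek().is_some() {
            for block in iter {
                acc += block;
            }
        }
        assert_eq!(acc, 6);
    }
}
```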
@yongkangc

@mediocregopher @mattsse addressed your feedback 👍🏻


@mediocregopher mediocregopher left a comment


One little question but overall LGTM, will need rebasing with #19430 merged

Add a dedicated `append_ref` method to `TrieInputSorted` that accepts
`&HashedPostStateSorted` directly, encapsulating the logic of extending
prefix sets and state. This makes the code cleaner and more maintainable
by moving the conversion outside and calling a dedicated method.

This addresses mediocregopher's review comment to have `append_ref`
accept sorted state directly, allowing for better encapsulation and
potential future optimizations.
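A minimal sketch of the shape of such an append_ref method, with simplified stand-in types and illustrative field names rather than reth's actual API:

```rust
// Simplified stand-ins: plain u64 pairs instead of hashed accounts/storages.
#[derive(Default)]
struct SortedBlockState {
    accounts: Vec<(u64, u64)>, // sorted by key
}

#[derive(Default)]
struct SortedTrieInput {
    state: Vec<(u64, u64)>,
    prefix_set: Vec<u64>, // keys touched, consumed later by the trie walk
}

impl SortedTrieInput {
    // Callers hand over the sorted state by reference; prefix-set and state
    // extension live in one place instead of at every call site.
    fn append_ref(&mut self, other: &SortedBlockState) {
        self.prefix_set.extend(other.accounts.iter().map(|(key, _)| *key));
        // A real implementation would merge by key; appending is enough to
        // show the encapsulation.
        self.state.extend_from_slice(&other.accounts);
    }
}

fn main() {
    let block = SortedBlockState { accounts: vec![(1, 10), (2, 20)] };
    let mut input = SortedTrieInput::default();
    input.append_ref(&block);
    assert_eq!(input.prefix_set, vec![1, 2]);
    assert_eq!(input.state.len(), 2);
}
```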
Add construct_prefix_sets() to HashedPostStateSorted and construct_prefix_set()
to HashedStorageSorted to support the append_ref method on TrieInputSorted.

These methods efficiently iterate over already-sorted data to build prefix sets
needed for trie computation, without requiring any sorting operations.

Also fix BuiltPayloadExecutedBlock conversion to correctly use Either::Right for
sorted variants and convert unsorted to sorted when needed.
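For the prefix-set part, a sketch of the idea with simplified stand-in types (u64 slot keys instead of nibble paths, not the actual reth signatures): because the input is already sorted, collecting the keys needs no extra sorting pass.

```rust
// Simplified stand-in for a sorted storage entry.
struct SortedStorage {
    slots: Vec<(u64, u64)>, // already sorted by slot key
}

impl SortedStorage {
    // The slots are already sorted, so the resulting prefix set comes out in
    // order without any sorting.
    fn construct_prefix_set(&self) -> Vec<u64> {
        self.slots.iter().map(|(slot, _)| *slot).collect()
    }
}

fn main() {
    let storage = SortedStorage { slots: vec![(1, 0), (4, 7)] };
    assert_eq!(storage.construct_prefix_set(), vec![1, 4]);
}
```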
Modify the handling of hashed state and trie updates in BuiltPayloadExecutedBlock to keep them unsorted until conversion is necessary. This change ensures that the conversion to sorted form occurs only when required, improving efficiency and clarity in the codebase.
Eliminates redundant filter().count() pre-pass in both
TrieUpdates::extend_from_sorted and StorageTrieUpdates::extend_from_sorted.
Instead, use sorted.account_nodes.len() / sorted.storage_nodes.len() directly
for capacity reservation.

The previous implementation iterated twice: once to count filtered entries,
then again to insert them. The new approach reserves using the full vector
length, resulting in slight over-allocation but avoiding the O(n) pre-pass
entirely. The insertion loop still filters correctly, ensuring behavioral
correctness.
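A sketch of the before/after on simplified data (not the actual reth method), showing the single-pass reservation:

```rust
use std::collections::HashMap;

// Reserve with the full sorted length instead of a counting pre-pass, then
// filter while inserting. `None` entries stand in for the filtered-out nodes.
fn extend_from_sorted(map: &mut HashMap<u64, u64>, sorted: &[(u64, Option<u64>)]) {
    // Old approach did two passes:
    //   let to_insert = sorted.iter().filter(|(_, node)| node.is_some()).count();
    //   map.reserve(to_insert);
    // New approach: one pass, possibly over-reserving by the filtered-out count.
    map.reserve(sorted.len());
    for &(key, node) in sorted {
        if let Some(node) = node {
            map.insert(key, node);
        }
    }
}

fn main() {
    let mut map = HashMap::new();
    extend_from_sorted(&mut map, &[(1, Some(10)), (2, None), (3, Some(30))]);
    assert_eq!(map.len(), 2);
    assert_eq!(map.get(&3), Some(&30));
}
```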
@mediocregopher mediocregopher added this pull request to the merge queue Nov 19, 2025
Merged via the queue into main with commit e58aa09 Nov 19, 2025
42 checks passed
@mediocregopher mediocregopher deleted the yk/compute_trie2 branch November 19, 2025 16:16
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Nov 19, 2025

Labels

A-engine: Related to the engine implementation
A-trie: Related to Merkle Patricia Trie implementation
C-perf: A change motivated by improving speed, memory usage or disk footprint

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Return sorted data from compute_trie_input

4 participants