Refactor consensus-execution sync to push sync data #3538

Tristan-Wilson · 2025-08-26T20:37:38Z

This change eliminates a circular dependency between the consensus and
execution layers by transforming the sync status flow from a pull-based
to a push-based model. Previously, the execution layer would query the
consensus layer for sync status through the ConsensusInfo interface,
creating a tight coupling between the layers.

The new architecture introduces a ConsensusSyncData structure that
contains sync status, target message count, and progress information.
The ConsensusExecutionSyncer now periodically pushes this data from
consensus to execution, where it's stored using an atomic pointer for
lock-free reads. This approach maintains consistency with the existing
finality data push mechanism and provides better performance through
reduced lock contention.
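As a rough illustration of the push-and-store pattern described above, here is a minimal Go sketch. The field types, the `UpdatedAt` field, and the `syncStore` holder with its `Set`/`Get` methods are assumptions made for this example, not the PR's actual definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// ConsensusSyncData mirrors the fields named in the description.
// The concrete types are assumptions for this sketch.
type ConsensusSyncData struct {
	Synced                 bool
	SyncTargetMessageCount uint64
	SyncProgressMap        map[string]interface{}
	UpdatedAt              time.Time
}

// syncStore is a hypothetical execution-side holder for the latest snapshot.
type syncStore struct {
	data atomic.Pointer[ConsensusSyncData]
}

// Set is what the pusher would call: it swaps in a fresh snapshot.
func (s *syncStore) Set(d *ConsensusSyncData) { s.data.Store(d) }

// Get returns the latest snapshot without locking, or nil if nothing
// has been pushed yet.
func (s *syncStore) Get() *ConsensusSyncData { return s.data.Load() }

func main() {
	var store syncStore
	fmt.Println(store.Get() == nil) // true: nothing pushed yet

	store.Set(&ConsensusSyncData{
		Synced:                 true,
		SyncTargetMessageCount: 42,
		UpdatedAt:              time.Now(),
	})
	fmt.Println(store.Get().SyncTargetMessageCount) // 42
}
```

Readers never block the pusher here, but each snapshot should be treated as immutable once stored.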

As part of this refactoring, the ConsensusInfo interface has been
simplified to only include the BlockMetadataAtMessageIndex method,
removing the now-redundant Synced, FullSyncProgressMap, and
SyncTargetMessageCount methods. This cleaner separation of concerns
better supports alternative client implementations by clearly defining
the data flow boundaries between consensus and execution layers.

Fixes NIT-3649

Also lower the sync interval for tests.

tsahee · 2025-08-26T21:27:37Z

arbnode/consensus_execution_syncer.go

+	syncData := &execution.ConsensusSyncData{
+		Synced:                 c.syncMonitor.Synced(),
+		SyncTargetMessageCount: c.syncMonitor.SyncTargetMessageCount(),
+		SyncProgressMap:        c.syncMonitor.FullSyncProgressMap(),


SyncProgressMap should be nil or empty when Synced is true (this avoids wasteful locking, etc.).

Tristan-Wilson · 2025-08-29

Addressed in latest commit.
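One way to read this suggestion, sketched in Go: only call the (potentially lock-heavy) progress-map getter when consensus is not synced. The `syncSource` interface, `buildSyncData`, and `fakeSource` are hypothetical names for illustration; only the `Synced` and `SyncProgressMap` fields come from the diff above.

```go
package main

import "fmt"

// ConsensusSyncData is reduced to the two fields relevant here.
type ConsensusSyncData struct {
	Synced          bool
	SyncProgressMap map[string]interface{}
}

// syncSource is a hypothetical stand-in for the consensus sync monitor.
type syncSource interface {
	Synced() bool
	FullSyncProgressMap() map[string]interface{}
}

// buildSyncData leaves SyncProgressMap nil when synced, so the progress
// map is never computed in the common case.
func buildSyncData(src syncSource) *ConsensusSyncData {
	synced := src.Synced()
	data := &ConsensusSyncData{Synced: synced}
	if !synced {
		data.SyncProgressMap = src.FullSyncProgressMap()
	}
	return data
}

// fakeSource counts how often the progress map is requested.
type fakeSource struct {
	synced bool
	calls  int
}

func (f *fakeSource) Synced() bool { return f.synced }

func (f *fakeSource) FullSyncProgressMap() map[string]interface{} {
	f.calls++
	return map[string]interface{}{"example": 1}
}

func main() {
	src := &fakeSource{synced: true}
	data := buildSyncData(src)
	fmt.Println(data.SyncProgressMap == nil, src.calls) // true 0
}
```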

tsahee · 2025-08-26T21:48:00Z

execution/gethexec/sync_monitor.go

-	synced, err := s.consensus.Synced().Await(ctx)
-	if err != nil {
-		log.Error("Error checking if consensus is synced", "err", err)
+	data := s.consensusSyncData.Load()


This should be more complex.
If you're pushing from consensus, there is already a delay between the time the message is pushed and the time it is read.
I think:

ConsensusSyncData should not include SyncTarget, but "maxMessageCount" from consensus (syncTarget is a delayed maxMessageCount).

The execution side should have a MsgLag config.

The execution side should have its own TargetMessage. Ideally, this would be the last MaxMessage it got from consensus more than MsgLag ago, but no more than 2*MsgLag ago. (If no data arrived in the last 2*MsgLag, the target doesn't matter: it's not in sync. If it only has data from within the last MsgLag, it should use the least recent one there.)

The execution side should say it's in sync only if:
** the last syncStatus from consensus is no more than MsgLag old
** the last syncStatus from consensus says consensus is synced
** execution has met the internal TargetMessage

Tristan-Wilson · 2025-08-29

Addressed in latest commit.

…nc-info

Only populate SyncProgressMap when not synced. MaxMessageCount is now a dedicated field that's always sent. Fix stale sync targets caused by push delay. Instead of consensus sending pre-calculated targets, it now sends raw MaxMessageCount. Execution maintains a sliding window history and calculates its own target using values from 1-2 MsgLag ago, properly accounting for the push delay. The default push interval and execution message lag are both 1 second so they work together well. Includes unit tests for the sliding window implementation.

codecov · 2025-09-05T13:55:44Z

Codecov Report

❌ Patch coverage is 91.01124% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 22.80%. Comparing base (cb86fca) to head (4756bfd).
⚠️ Report is 14 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3538      +/-   ##
==========================================
+ Coverage   22.70%   22.80%   +0.09%     
==========================================
  Files         388      388              
  Lines       58900    59016     +116     
==========================================
+ Hits        13375    13456      +81     
- Misses      43486    43517      +31     
- Partials     2039     2043       +4

tsahee · 2025-09-22T22:36:22Z

execution/gethexec/sync_monitor.go

+	windowEnd := now.Add(-h.msgLag)
+
+	for _, entry := range h.entries {
+		if !entry.timestamp.Before(windowStart) && !entry.timestamp.After(windowEnd) {


When I look at the current description of msgLag, I think what we actually want is the oldest message that's less than MsgLag old (2*MsgLag is not relevant).
We can discard anything that's more than MsgLag old.
We could do other things with different documentation for MsgLag, but this method (which is different from what I said before) seems to fit the current documentation and is simple enough.

Tristan-Wilson · 2025-09-23

Updated it to use the oldest entry that is less than msgLag old.

tsahee · 2025-09-22T22:43:23Z

execution/gethexec/sync_monitor.go

+
+	// Add the max message count to history for sync target calculation
+	if syncData != nil && syncData.MaxMessageCount > 0 {
+		s.syncHistory.add(syncData.MaxMessageCount, syncData.UpdatedAt)


I think we should use the minimum of (syncData.UpdatedAt, time.Now()) for the timestamp, so that if clocks differ between components we at least know the timestamp is not in the future.

Tristan-Wilson · 2025-09-23

Good idea, I now cap the timestamp at time.Now().

tsahee · 2025-09-22T22:44:39Z

execution/gethexec/sync_monitor.go

 	}

+	// Always add the max message count
+	res["maxMessageCount"] = data.MaxMessageCount


This comes from consensus, so let's call it "consensusMaxMessageCount" or something.

Tristan-Wilson · 2025-09-23

Changed it to "consensusMaxMessageCount".

tsahee · 2025-09-22T22:46:55Z

execution/gethexec/sync_monitor.go

+	defer h.mutex.RUnlock()
+
+	if len(h.entries) == 0 {
+		return 0


Not entirely certain, but I think in this case it's better to return an error rather than 0, to make sure nothing mistakenly concludes that we're in sync.

Tristan-Wilson · 2025-09-23

I think it's okay because we return not synced if it's zero:

func (s *SyncMonitor) Synced(ctx context.Context) bool {
	...
	// Calculate the sync target based on historical data
	syncTarget := s.syncHistory.getSyncTarget(now)
	if syncTarget == 0 {
		// No valid sync target available yet
		return false
	}
	...
}

- Simplify sync target to use oldest entry < msgLag ago (not 2*msgLag window)
- Use min(now, syncData.UpdatedAt) to prevent future timestamps
- Rename maxMessageCount to consensusMaxMessageCount for clarity
- Update tests to match new msgLag-based trimming behavior

…nc-info Fix minor conflict around using pflag instead of flag.

Tristan-Wilson · 2025-09-25T05:52:37Z

Tests are passing now after merging in latest master, assigning back to Tsahi for review.

Tristan-Wilson added 2 commits August 26, 2025 10:16

Start the ConsensusExecutionSyncer for l2-only mode

8418a7d

Also lower the sync interval for tests.

Tristan-Wilson requested review from diegoximenes and tsahee August 26, 2025 20:37

Tristan-Wilson assigned diegoximenes Aug 26, 2025

Tristan-Wilson mentioned this pull request Aug 26, 2025

Don't broadcast old messages to the feed #3526

Open

tsahee requested changes Aug 26, 2025

diegoximenes assigned Tristan-Wilson and unassigned diegoximenes Aug 27, 2025

Tristan-Wilson and others added 3 commits August 29, 2025 14:25

Merge branch 'master' into consensus-pushes-sync-info

11ef4b7

Merge remote-tracking branch 'origin/master' into consensus-pushes-sy…

07dd219

…nc-info

Tristan-Wilson requested a review from tsahee August 29, 2025 17:47

Tristan-Wilson assigned tsahee and unassigned Tristan-Wilson Sep 1, 2025

Tristan-Wilson and others added 3 commits September 4, 2025 13:45

Add SyncMonitor default to ConfigDefault

cfc1913

Merge branch 'master' into consensus-pushes-sync-info

5630184

Sleep to give time for ConsensusExecutionSyncer

4756bfd

tsahee added the after-next-version This PR shouldn't be merged until after the next version is released label Sep 8, 2025

tsahee requested changes Sep 22, 2025

tsahee removed the after-next-version This PR shouldn't be merged until after the next version is released label Sep 22, 2025

tsahee assigned Tristan-Wilson and unassigned tsahee Sep 22, 2025

Tristan-Wilson and others added 4 commits September 23, 2025 12:51

Merge remote-tracking branch 'origin/master' into consensus-pushes-sy…

47c85ff

…nc-info Fix minor conflict around using pflag instead of flag.

Merge branch 'master' into consensus-pushes-sync-info

82de959

Merge branch 'master' into consensus-pushes-sync-info

8258a39

Tristan-Wilson assigned tsahee and unassigned Tristan-Wilson Sep 25, 2025

Tristan-Wilson requested a review from tsahee September 26, 2025 05:39
