feat: chain orchestrator #185
Conversation
Pull Request Overview
This PR implements a new `ChainOrchestrator` to replace the previous indexer, integrates it throughout the node, watcher, network, and engine, and updates tests and database migrations accordingly.

- Introduces `ChainOrchestrator` in place of `Indexer` and refactors `RollupNodeManager` to consume orchestrator events instead of indexer events.
- Adds `Synced` notifications to `L1Watcher` and updates the engine driver to handle optimistic sync via `ChainOrchestrator`.
- Refactors configuration (`ScrollRollupNodeConfig`), network manager, and database migrations; adjusts tests to cover the new orchestrator flows.
Reviewed Changes
Copilot reviewed 40 out of 41 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/indexer/src/lib.rs | Rename `Indexer` to `ChainOrchestrator` and overhaul API flows |
| crates/manager/src/manager/mod.rs | Replace indexer usage with `ChainOrchestrator` in node manager |
| crates/node/src/args.rs | Instantiate `ChainOrchestrator` in `ScrollRollupNodeConfig` |
| crates/watcher/src/lib.rs | Add `Synced` variant and `is_synced` flag to `L1Watcher` |
| crates/scroll-wire/src/protocol/proto.rs | Adjust doc comment for `NewBlock::new` |
| crates/node/tests/e2e.rs | Add/revise reorg and sync end-to-end tests |
| crates/watcher/tests/reorg.rs | Update tests to skip `Synced` notifications |
| crates/database/db/src/operations.rs | Extend DB ops with `L1MessageStart` and block-and-batch queries |
| crates/database/migration/src/migration_info.rs | Add `genesis_hash()` to migrations and insert genesis blocks |
| crates/network/src/manager.rs | Wire up eth-wire listener and dispatch chain-orchestrator events |
| crates/engine/src/driver.rs | Support `ChainImport` and `OptimisticSync` futures in engine driver |
Comments suppressed due to low confidence (2)
crates/scroll-wire/src/protocol/proto.rs:33

The doc comment uses "blocks" (plural) but the constructor takes a single block; change to "block" for accuracy.

```rust
/// Returns a [`NewBlock`] instance with the provided signature and blocks.
```

crates/node/tests/e2e.rs:95

The `follower_can_reorg` test has no assertions; either add meaningful checks or remove the empty test to maintain coverage.

```rust
async fn follower_can_reorg() -> eyre::Result<()> {
```
```diff
@@ -91,15 +91,19 @@ async fn test_should_detect_reorg() -> eyre::Result<()> {
             continue
         }
 
+        // skip the `L1Notification::Synced` notifications
+        let mut notification = l1_watcher.recv().await.unwrap();
+        if matches!(notification.as_ref(), L1Notification::Synced) {
```
This only skips one `Synced` notification; consider looping (e.g. `while matches!(...) { ... }`) to skip all consecutive `Synced` messages.

```diff
-if matches!(notification.as_ref(), L1Notification::Synced) {
+while matches!(notification.as_ref(), L1Notification::Synced) {
```
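For reference, a minimal sketch of the looping variant, assuming the `l1_watcher.recv()` API used in the test above; it keeps receiving until the first non-`Synced` notification:

```rust
// Skip every consecutive `Synced` notification before asserting on the
// reorg event (sketch only; assumes the surrounding test's setup).
let mut notification = l1_watcher.recv().await.unwrap();
while matches!(notification.as_ref(), L1Notification::Synced) {
    notification = l1_watcher.recv().await.unwrap();
}
```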
A couple of comments and some small nits, plus leftover code to clean up.
```diff
@@ -33,7 +36,7 @@ impl IndexerItem {
     }
 }
 
-/// The metrics for the [`super::Indexer`].
+/// The metrics for the [`super::ChainOrchestrator`].
 #[derive(Metrics, Clone)]
 #[metrics(scope = "indexer")]
 pub struct IndexerMetrics {
```
nit: `ChainOrchestratorMetrics`
```rust
let mut received_chain_headers = vec![received_block.header.clone()];
let mut received_header_tail = received_block.header.clone();

// We should never have a re-org that is deeper than the current safe head.
```
Why up to the safe head and not the finalized head?
This becomes a protocol design question. This assumes that L1 is the source of truth and that batches posted to L1 should never be reorged. If so, then we would never expect a reorg deeper than the safe head. If there is an L1 reorg or a batch revert, the safe head would change and a deeper reorg would be allowed. If we allowed safe blocks to be reorged, a rogue sequencer could override batches posted to L1; would we want to allow this?
No, you are right. I'm wondering if "safe" should mean committed on L1 *and* L1 finalized, because currently "safe" in the DB is just committed on L1, I believe.
I think this is worthy of open discourse amongst the team. In my opinion, safe should mean that it is highly probable that the block will be settled at some later time. The only way a block associated with a batch posted to L1 would not be settled is if the batch was reorged (and not included later), and given that reorgs are relatively infrequent and shallow, I think the current approach is reasonable. Furthermore, I think we can consider a block finalized if the batch associated with it is included in a finalized L1 block. Provided we work under the assumption of completeness of the proof system and permissionless submission of proofs, I think we can mark these blocks as described before any proofs are posted to L1. Let's open this discussion to the team more broadly.
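To make the proposal in this thread concrete, here is an illustrative sketch of the mapping being discussed; all names and variants are hypothetical, not from the PR:

```rust
/// Hypothetical batch status on L1 (illustration only, not PR code).
enum BatchStatus {
    /// Batch not yet posted to L1.
    NotCommitted,
    /// Batch committed in an L1 block (the current DB meaning of "safe").
    CommittedOnL1,
    /// Batch included in a finalized L1 block.
    CommittedInFinalizedL1Block,
}

/// Hypothetical L2 block label derived from the batch status.
enum L2BlockLabel {
    Unsafe,
    Safe,
    Finalized,
}

/// Labels an L2 block before any proofs are posted, under the assumptions of
/// proof-system completeness and permissionless proof submission stated above.
fn label(batch: BatchStatus) -> L2BlockLabel {
    match batch {
        BatchStatus::NotCommitted => L2BlockLabel::Unsafe,
        BatchStatus::CommittedOnL1 => L2BlockLabel::Safe,
        BatchStatus::CommittedInFinalizedL1Block => L2BlockLabel::Finalized,
    }
}
```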
crates/chain-orchestrator/src/lib.rs (outdated)
```rust
                received_header_tail = header;
            } else {
                return Err(ChainOrchestratorError::MissingBlockHeader {
                    hash: current_chain_headers.front().unwrap().parent_hash,
                });
            }
        }
    }

    // We search the in-memory chain to see if we can reconcile the block import.
    if let Some(pos) = current_chain_headers
        .iter()
        .rposition(|h| h.hash_slow() == received_header_tail.parent_hash)
```
I'm not sure I understand this part of the code: you fetch 50 blocks from the L2 client, starting at the received block. At the end of the fetch phase, `received_header_tail = received_block - 50` and `current_chain_headers` contains blocks from `received_block` to `received_block - 50`. But would the above branch ever work then?
I think there is a bug here; we shouldn't be updating `received_header_tail = header`. Let me update this and add some test cases for deep reorgs.
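A hedged sketch of the backward-fill invariant under discussion, using a simplified header type; `fetch_parent` is a hypothetical stand-in for the L2 client call, and the depth limit follows the "50 blocks" mentioned above:

```rust
/// Simplified header type for illustration (not the PR's types).
#[derive(Clone)]
struct Header {
    hash: [u8; 32],
    parent_hash: [u8; 32],
}

/// Extends `received` backwards (child -> parent) until its oldest header
/// connects to a hash in `current`, or the depth limit is hit. The tail only
/// advances when a parent is actually fetched, avoiding the bug noted above
/// where the tail was updated while iterating the in-memory chain.
fn fill_backwards(
    received: &mut Vec<Header>,
    current: &[Header],
    fetch_parent: impl Fn([u8; 32]) -> Option<Header>,
    max_depth: usize, // e.g. 50
) -> bool {
    while received.len() < max_depth {
        let tail_parent = received.last().expect("starts non-empty").parent_hash;
        if current.iter().any(|h| h.hash == tail_parent) {
            return true; // reconciled with the in-memory chain
        }
        match fetch_parent(tail_parent) {
            Some(parent) => received.push(parent),
            None => return false, // missing header; caller maps this to an error
        }
    }
    false // reorg deeper than the allowed depth
}
```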
```rust
    signer_args: Default::default(),
};

// Create the chain spec for scroll dev with Euclid v2 activated and a test genesis.
```
nit:

```diff
-// Create the chain spec for scroll dev with Euclid v2 activated and a test genesis.
+// Create the chain spec for scroll dev with Feynman activated and a test genesis.
```
```diff
@@ -40,7 +40,7 @@ async fn can_build_blocks() {
     reth_tracing::init_test_tracing();
 
     const BLOCK_BUILDING_DURATION: Duration = Duration::from_millis(0);
-    const BLOCK_GAP_TRIGGER: u64 = 100;
+    // const BLOCK_GAP_TRIGGER: u64 = 100;
```
remove?
```diff
@@ -284,7 +281,7 @@ async fn can_build_blocks_with_finalized_l1_messages() {
 
     let chain_spec = SCROLL_DEV.clone();
     const BLOCK_BUILDING_DURATION: Duration = tokio::time::Duration::from_millis(0);
-    const BLOCK_GAP_TRIGGER: u64 = 100;
+    // const BLOCK_GAP_TRIGGER: u64 = 100;
```
remove?
```diff
@@ -671,7 +674,7 @@ async fn can_build_blocks_and_exit_at_time_limit() {
     let chain_spec = SCROLL_DEV.clone();
     const MIN_TRANSACTION_GAS_COST: u64 = 21_000;
     const BLOCK_BUILDING_DURATION: Duration = Duration::from_secs(1);
-    const BLOCK_GAP_TRIGGER: u64 = 100;
+    // const BLOCK_GAP_TRIGGER: u64 = 100;
```
remove?
```diff
-            if self.is_synced() {
+            if self.is_synced {
                 tokio::time::sleep(SLOW_SYNC_INTERVAL).await;
             } else if self.current_block_number == self.l1_state.head {
                 // if we have synced to the head of the L1, notify the channel and set the
                 // `is_synced` flag.
                 if let Err(L1WatcherError::SendError(_)) = self.notify(L1Notification::Synced).await
                 {
                     tracing::warn!(target: "scroll::watcher", "L1 watcher channel closed, stopping the watcher");
                     break;
                 }
                 self.is_synced = true;
```
The current logic suggests the watcher can never transition from `is_synced = true` to `false`. Is this expected?
Good question. In the context of the RN, `Synced` should mean that we have synced all L1 messages required to validate the L1 messages included in unsafe L2 blocks. Given that we only include L1 messages after the corresponding L1 block has been finalized, I think this should be fine: provided the watcher doesn't start to lag more than 2 epochs behind the safe tip, the `Synced` status should remain valid. What do you think about this?
Hmmm, but then if we lose a provider for 12 minutes we might enter an edge case we can't exit from?
Good point, and given that we have had recent experiences of the L1 provider being down for longer than 12 minutes, I think we should cover this case.
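One minimal way to cover it, sketched under assumptions: the watcher drops back out of the synced state once it lags the L1 head by more than some threshold. The helper and the threshold value below are hypothetical, not from the PR:

```rust
/// Returns whether the watcher should leave the synced state because it has
/// fallen too far behind the L1 head (hypothetical helper; the threshold is
/// an assumed value, roughly 2 L1 epochs at 32 slots each).
fn should_unsync(is_synced: bool, l1_head: u64, current_block: u64) -> bool {
    const MAX_SYNCED_LAG: u64 = 64;
    is_synced && l1_head.saturating_sub(current_block) > MAX_SYNCED_LAG
}
```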
A couple of extra comments and questions. Also, I see we have a lot more `unwrap`s in the code; are all of these safe to keep?
```rust
// Reverse the new chain headers to have them in the correct order.
received_chain_headers.reverse();
```
Personally, I think we would gain in code clarity if `received_chain_headers` and `current_chain_headers` were ordered in the same way.
```rust
let consolidated = if !*optimistic_mode.lock().await {
    true
```
We don't perform any consolidation here. Is this because in `!optimistic_mode` the consolidation happens in `handle_new_block`?
```rust
// Purge all pending block imports.
self.chain_imports.clear();
```
Why do we purge all pending block imports?
```rust
/// Handles a new block received from the eth-wire protocol.
fn handle_eth_wire_block(
```
makes sense to have this here 👌
```rust
// Handle blocks received from the eth-wire protocol.
while let Some(Poll::Ready(Some(block))) =
```
Note (not relevant for now, but let's keep it in mind): at some point we might want to implement the same logic as in Reth to avoid having some components in the node monopolize all resources.
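For context, the Reth pattern alluded to bounds how many items a stream may yield per poll, so one busy component cannot starve the rest of the task. A self-contained sketch of that pattern; all names here are illustrative, not from Reth or this PR:

```rust
use std::task::{Context, Poll};

use futures::{Stream, StreamExt};

/// Drains at most `budget` items from `stream` per poll, then reschedules
/// the task so other components sharing it can make progress.
fn poll_with_budget<S, T>(
    stream: &mut S,
    cx: &mut Context<'_>,
    mut budget: u32,
    mut on_item: impl FnMut(T),
) where
    S: Stream<Item = T> + Unpin,
{
    while let Poll::Ready(Some(item)) = stream.poll_next_unpin(cx) {
        on_item(item);
        budget -= 1;
        if budget == 0 {
            // Budget exhausted: wake ourselves and yield to the executor.
            cx.waker().wake_by_ref();
            break;
        }
    }
}
```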
```diff
@@ -599,7 +598,7 @@ async fn can_build_blocks_and_exit_at_gas_limit() {
     let chain_spec = SCROLL_DEV.clone();
     const MIN_TRANSACTION_GAS_COST: u64 = 21_000;
     const BLOCK_BUILDING_DURATION: Duration = Duration::from_millis(250);
-    const BLOCK_GAP_TRIGGER: u64 = 100;
+    // const BLOCK_GAP_TRIGGER: u64 = 100;
```
remove?
closes: #182