feat: graceful shutdown #150
base: main
Conversation
Thank you for refactoring some parts of the code base as well! I left a couple of small inline comments.
```diff
-l1_message.insert(self.get_connection()).await?;
+models::l1_message::Entity::insert(l1_message)
+    .on_conflict(
+        OnConflict::column(models::l1_message::Column::QueueIndex).do_nothing().to_owned(),
```
Could we have an edge case where we index an L1 message, shut down, the L1 reorgs, and we then don't update the stored message because of the `do_nothing` directive?
```diff
@@ -511,3 +532,18 @@ fn delayed_interval(interval: u64) -> Interval {
     interval.set_missed_tick_behavior(tokio::time::MissedTickBehavior::Delay);
     interval
 }
+
+/// Computes the starting block for the watcher from the database.
+pub async fn compute_watcher_start_block_from_database(
```
should this live here?
```rust
.await?
.expect("Batch info must be present due to database query arguments");
let block = self.get_highest_block_for_batch(batch.hash).await?;
tracing::info!(target:"test", "{:?}", block);
```
Looks like a leftover debug log.
Overview
This PR implements a graceful shutdown for the rollup node to ensure that critical in-flight tasks are driven to completion before shutdown.
Test strategy
The test for this feature starts a rollup node external to the EN so that we can poll it manually. This is required to simulate a crash/shutdown midway through the processing of a batch. After the simulated crash, we verify that the node can start up again and reprocess the batch it crashed on, and we assert that the EN safe block progresses to the expected chain tip upon restart.
Note: There may be an edge case on restart in which the finalized head is ahead of the safe head, since we revert to the safe head. To handle this, when we issue the forkchoice update, we only do so if the safe block is ahead of the finalized block.
Implementation
The following modifications have been made to support this feature:
- Introduce a `ConsolidationOutcome` type to clean up the consolidation interface
- Implement a `run_until_graceful_shutdown` function on the RNM and integrate with the reth `TaskExecutor`
- `ScrollRollupConfig`