@tbenr tbenr commented Nov 21, 2025

This is an alternative to DASCustodySync.
The class is only introduced here and is not activated yet. Flags and an additional DB variable will be added in subsequent PRs.

Implements:
1- backfill of custody columns after checkpoint sync
2- at startup, a check that our most recent blocks have all their custody columns (to mitigate custody columns not being written to disk within the same transaction as the block)
3- when custody increases (due to an increase in staked ETH), a restart of the backfill to download the additional required columns
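
For readers who want a mental model of points 1 and 3, here is a minimal sketch of a backward-moving cursor that hands out slot batches and restarts when the custody group count grows. It is not the DasCustodyBackfiller from this PR: the class name, fields, and method signatures are all hypothetical, and the real service also deals with block lookup, sidecar retrieval, and persistence, none of which is modelled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the DasCustodyBackfiller from this PR. */
final class BackfillCursorSketch {
  private long cursorSlot;           // next slot to check, moving backwards
  private final long minCustodySlot; // oldest slot for which custody is required
  private int targetGroupCount;      // custody group count the backfill is working towards

  BackfillCursorSketch(final long startSlot, final long minCustodySlot, final int groupCount) {
    this.cursorSlot = startSlot;
    this.minCustodySlot = minCustodySlot;
    this.targetGroupCount = groupCount;
  }

  /** Returns the slots of the next batch, newest first; empty once backfill is complete. */
  List<Long> nextBatch(final int batchSize) {
    final List<Long> slots = new ArrayList<>();
    while (slots.size() < batchSize && cursorSlot >= minCustodySlot) {
      slots.add(cursorSlot--);
    }
    return slots;
  }

  /** Backfill is complete once the cursor has moved past the minimum custody slot. */
  boolean isComplete() {
    return cursorSlot < minCustodySlot;
  }

  /** On a custody increase, move the cursor back to the head so every slot is revisited. */
  void onGroupCountUpdate(final int newGroupCount, final long headSlot) {
    if (newGroupCount > targetGroupCount) {
      targetGroupCount = newGroupCount;
      cursorSlot = headSlot;
    }
  }
}
```

Calling `nextBatch` repeatedly walks from the start slot down to the minimum custody slot; a larger group count passed to `onGroupCountUpdate` resets the walk, while an equal or smaller one is ignored.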

Documentation

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.

Changelog

  • I thought about adding a changelog entry, and added one if I deemed necessary.

Note

Adds a new DasCustodyBackfiller service to backfill custody columns (with batching, startup head checks, and resync on custody changes) and updates/extends tests to validate behavior.

  • Ethereum/statetransition (DAS custody):
    • New DasCustodyBackfiller service: Backfills custody data columns with batched processing, startup heads custody check, cursor management, and resync when custody group count increases.
    • Cancels pending non-canonical requests on finalized checkpoint updates; retrieves/stores missing DataColumnSidecars via retriever/custody components.
  • Tests:
    • Unit tests: Add DasCustodyBackfillerTest covering scheduling, cursor init/movement, head-check batch, missing column retrieval, gaps with no blocks, cancellation on reorg, and completion at min custody slot.
    • Acceptance: Update DasCheckpointSyncAcceptanceTest to wait for custody backfill; adjust disabled reason.
  • Test fixtures:
    • Add TekuBeaconNode.waitForCustodyBackfill(UInt64, int) helper used by acceptance test.

Written by Cursor Bugbot for commit 2163bea. This will update automatically on new commits.

@tbenr tbenr force-pushed the DASCustodyBackfiller branch 3 times, most recently from 6185c8e to 942263b Compare November 28, 2025 08:56
@tbenr tbenr force-pushed the DASCustodyBackfiller branch from 942263b to 684b9fe Compare December 2, 2025 10:24
@zilm13 zilm13 mentioned this pull request Dec 2, 2025
@tbenr tbenr marked this pull request as ready for review December 2, 2025 19:44
__ -> {
  if (slot.isLessThanOrEqualTo(batchData.minCustodyPeriodSlot)) {
    custodyGroupCountManager.setCustodyGroupSyncedCount(
        batchData.requiredColumnsInCustody.size());

Bug: Column count passed instead of group count

The code calls setCustodyGroupSyncedCount(batchData.requiredColumnsInCustody.size()) where requiredColumnsInCustody is the list of column indices from getCustodyColumnIndices(). However, setCustodyGroupSyncedCount expects a custody group count, not a column count. Since each custody group can contain multiple columns (computed as NUMBER_OF_COLUMNS / NUMBER_OF_CUSTODY_GROUPS), these values differ. The method name, parameter name, and metric description ("custody_groups_backfilled") all indicate a group count is expected, not column count.
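
To make the mismatch concrete, the sketch below converts a custody column list into a group count using the NUMBER_OF_COLUMNS / NUMBER_OF_CUSTODY_GROUPS ratio mentioned above. The helper class and its methods are hypothetical, the constants (128 columns, 64 groups) are illustrative rather than taken from any network configuration, and the conversion assumes the list holds only complete groups. The two counts coincide only when both constants are equal, which could mask the problem in some configurations.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of Teku. */
final class CustodyCountSketch {

  /** Columns per custody group: NUMBER_OF_COLUMNS / NUMBER_OF_CUSTODY_GROUPS. */
  static int columnsPerGroup(final int numberOfColumns, final int numberOfCustodyGroups) {
    return numberOfColumns / numberOfCustodyGroups;
  }

  /** Converts a custody column list into a group count, assuming only complete groups. */
  static int groupCount(
      final List<Integer> custodyColumns,
      final int numberOfColumns,
      final int numberOfCustodyGroups) {
    return custodyColumns.size() / columnsPerGroup(numberOfColumns, numberOfCustodyGroups);
  }

  public static void main(final String[] args) {
    // Illustrative values only: 128 columns in 64 groups gives 2 columns per group,
    // so these eight column indices correspond to four custody groups.
    final List<Integer> custodyColumns = List.of(3, 67, 10, 74, 21, 85, 40, 104);
    System.out.println("columns: " + custodyColumns.size());
    System.out.println("groups:  " + groupCount(custodyColumns, 128, 64));
  }
}
```

In the PR itself the simpler fix is probably to pass the custody group count the backfiller already tracks instead of deriving it from the column list.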


__ -> {
  if (slot.isLessThanOrEqualTo(batchData.minCustodyPeriodSlot)) {
    custodyGroupCountManager.setCustodyGroupSyncedCount(
        batchData.requiredColumnsInCustody.size());

Bug: Column count passed instead of group count to setter

The code passes batchData.requiredColumnsInCustody.size() (the column count) to setCustodyGroupSyncedCount, which expects the custody group count. Since requiredColumnsInCustody is populated from getCustodyColumnIndices() and each custody group contains multiple columns, the column count is typically larger than the group count. This will incorrectly set the synced group count to a higher value than intended, potentially preventing future custody re-sync operations from triggering when they should (since onGroupCountUpdate compares against currentSyncCustodyGroupCount, which gets its initial value from getCustodyGroupSyncedCount).
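
The consequence described above can be modelled in a few lines. This sketch assumes that onGroupCountUpdate restarts backfill only when the new group count is strictly greater than the count recorded as synced; that rule is inferred from the comment, not confirmed against the PR, and the class and method names are hypothetical.

```java
/** Hypothetical model of the resync trigger; the real check in the PR may differ. */
final class ResyncTriggerSketch {

  /** Assumed rule: resync only when the new group count exceeds the recorded synced count. */
  static boolean needsResync(final int syncedGroupCount, final int newGroupCount) {
    return newGroupCount > syncedGroupCount;
  }

  public static void main(final String[] args) {
    final int newGroupCount = 6; // custody grows from 4 to 6 groups

    // Synced count stored correctly as 4 groups: the increase is detected.
    System.out.println(needsResync(4, newGroupCount));

    // Synced count stored as 8 (a column count): the increase goes unnoticed.
    System.out.println(needsResync(8, newGroupCount));
  }
}
```

Under that assumption, an inflated synced count silently suppresses the backfill restart for any later increase that stays below the stored value.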

