Add stale reads to spanner #1639
Conversation
Separating out some fixes from #1639 to unblock while the stale reads are still in discussion (the failing tests were making it harder to evaluate performance, so it'd be helpful to get the fixes in sooner).

Includes:
* Sorting observation query test results for determinism in tests
* Adding "distinct" to all chaining queries to avoid duplicates. "Match any" is not sufficient if there are multiple paths between nodes (see the sketch below).
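The duplicate issue is easy to see with a two-hop chain: if there are two different intermediate nodes between the same endpoints, a plain self-join returns the endpoint pair twice. A rough sketch, where the Edge table and column names are illustrative and not the repo's actual chaining query:

```go
package store

import "cloud.google.com/go/spanner"

// chainStatement is an illustrative sketch, not the repo's actual chaining
// query: a two-hop traversal written as a self-join on an Edge table returns
// one row per intermediate node between the same endpoints, so DISTINCT is
// needed to collapse the duplicates.
func chainStatement(predicate string) spanner.Statement {
	return spanner.Statement{
		SQL: `
			SELECT DISTINCT e1.subject_id, e2.object_id
			FROM Edge AS e1
			JOIN Edge AS e2 ON e1.object_id = e2.subject_id
			WHERE e1.predicate = @p AND e2.predicate = @p`,
		Params: map[string]interface{}{"p": predicate},
	}
}
```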
Going to rethink this a bit after some discussion with Vishal. tl;dr: having an in-memory cache could cause inconsistency across shards.
vish-cs left a comment
Thanks for making the changes!
const (
	// CACHE_DURATION defines how long the CompletionTimestamp is kept in memory before being refetched.
	CACHE_DURATION = 5 * time.Second
nit: can we rename this to TIMESTAMP_CACHE_DURATION just to be explicit?
	withStruct func(interface{}),
) error {
	iter := sc.client.Single().Query(ctx, stmt)
	timestampBound, err := sc.GetStalenessTimestampBound(ctx)
Do we want a feature flag to disable stale reads?
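For illustration, such a flag could guard the bound lookup and fall back to a plain strong read when disabled. A minimal sketch, where the enableStaleReads field and getBound helper are hypothetical stand-ins and not part of this PR:

```go
package store

import (
	"context"

	"cloud.google.com/go/spanner"
)

// flaggedClient is a hypothetical sketch, not this PR's SpannerClient.
type flaggedClient struct {
	client           *spanner.Client
	enableStaleReads bool
	// getBound stands in for something like GetStalenessTimestampBound.
	getBound func(context.Context) (spanner.TimestampBound, error)
}

// singleUse returns a single-use read-only transaction. With the flag off, or
// if the bound cannot be resolved, it falls back to a strong read.
func (sc *flaggedClient) singleUse(ctx context.Context) *spanner.ReadOnlyTransaction {
	if !sc.enableStaleReads {
		return sc.client.Single()
	}
	bound, err := sc.getBound(ctx)
	if err != nil {
		return sc.client.Single()
	}
	return sc.client.Single().WithTimestampBound(bound)
}
```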
		return nil, err
	}

	timestampBound := spanner.ReadTimestamp(*completionTimestamp)
Curious, what's the difference between completionTimestamp and timestampBound?
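For context, completionTimestamp is the raw time.Time read from the IngestionHistory table (when the last ingestion finished), while timestampBound is the spanner.TimestampBound built from it, which tells Spanner to serve data exactly as it existed at that instant. A minimal sketch of the relationship (the helper name is hypothetical):

```go
package store

import (
	"time"

	"cloud.google.com/go/spanner"
)

// staleBound converts the raw completion time into the read option handed to
// the Spanner client: an exact-timestamp bound rather than a strong read.
func staleBound(completionTimestamp time.Time) spanner.TimestampBound {
	return spanner.ReadTimestamp(completionTimestamp)
}
```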
	err = sc.processRows(iter, newStruct, withStruct)

	// Check if the error is due to an expired timestamp (FAILED_PRECONDITION).
Curious when that can happen... the max 7-day retention timeout?
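For reference, Spanner rejects reads at a timestamp older than the database's version retention period with FAILED_PRECONDITION, so with retention at the 7-day maximum this fires only if the cached timestamp is more than 7 days old. A rough sketch of the fallback, assuming a hypothetical runQuery helper in place of processRows:

```go
package store

import (
	"context"

	"cloud.google.com/go/spanner"
	"google.golang.org/grpc/codes"
)

// queryWithStaleFallback retries with a strong read when the stale read fails
// because the requested timestamp is outside the version retention window.
// runQuery is a hypothetical helper that consumes (and stops) the iterator.
func queryWithStaleFallback(
	ctx context.Context,
	client *spanner.Client,
	stmt spanner.Statement,
	bound spanner.TimestampBound,
	runQuery func(*spanner.RowIterator) error,
) error {
	err := runQuery(client.Single().WithTimestampBound(bound).Query(ctx, stmt))
	if spanner.ErrCode(err) == codes.FailedPrecondition {
		// The timestamp is older than the retention period; retry strongly.
		return runQuery(client.Single().Query(ctx, stmt))
	}
	return err
}
```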
// It prioritizes returning a value from an in-memory cache to reduce Spanner traffic.
func (sc *SpannerClient) getCompletionTimestamp(ctx context.Context) (*time.Time, error) {
	// Check cache
	sc.cacheMutex.RLock()
As discussed, we need to think through how consistency would be ensured across caches in different mixer instances.
Timestamps are read from the IngestionHistory table, but are cached in memory for 5 seconds to avoid a large increase in traffic. I can adjust the expiry as needed, depending on how frequently we expect the incremental data to be refreshed and how fresh we want the data to be.

Since the running Spanner instance (dc_graph_2025_09_15) doesn't have incremental ingestion, the timestamp will eventually go stale (it's manually set), so I added a temporary fallback to strong reads. Once we switch to incremental ingestion, we should make this check stronger and revisit whether we want these queries to actually fail, since a stale timestamp indicates that something went wrong in ingestion. (I set the version retention to the maximum of 7 days.)
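To make the caching behavior concrete, here is a minimal sketch of a read-through cache guarded by a sync.RWMutex; the struct, field, and column names are assumptions rather than the PR's exact code, and each instance caches independently, so different instances can briefly observe different timestamps:

```go
package store

import (
	"context"
	"sync"
	"time"

	"cloud.google.com/go/spanner"
)

const timestampCacheDuration = 5 * time.Second

// timestampCache is a sketch of the in-memory cache; names are assumptions.
type timestampCache struct {
	client *spanner.Client

	mu        sync.RWMutex
	cached    *time.Time
	fetchedAt time.Time
}

// completionTimestamp serves the latest ingestion completion time from memory
// for timestampCacheDuration, then re-reads it from the IngestionHistory table
// (the completion_time column name is assumed) with a strong read.
func (c *timestampCache) completionTimestamp(ctx context.Context) (*time.Time, error) {
	c.mu.RLock()
	if c.cached != nil && time.Since(c.fetchedAt) < timestampCacheDuration {
		ts := c.cached
		c.mu.RUnlock()
		return ts, nil
	}
	c.mu.RUnlock()

	// Cache miss or expired: fetch the newest completion time.
	stmt := spanner.Statement{
		SQL: `SELECT completion_time FROM IngestionHistory ORDER BY completion_time DESC LIMIT 1`,
	}
	iter := c.client.Single().Query(ctx, stmt)
	defer iter.Stop()
	row, err := iter.Next()
	if err != nil {
		return nil, err
	}
	var ts time.Time
	if err := row.Columns(&ts); err != nil {
		return nil, err
	}

	c.mu.Lock()
	c.cached = &ts
	c.fetchedAt = time.Now()
	c.mu.Unlock()
	return &ts, nil
}
```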