
REP-6492 Switch to $sampleRate-style partitioning #128


Status: Open (11 commits to merge into base: main)

FGasper
Collaborator

@FGasper FGasper commented Aug 18, 2025

$sample-based partitioning has proven problematic for some years now because it often creates highly imbalanced partitions.

This changeset switches partitioning to use $sampleRate instead. Because $sampleRate entails a full index scan, it tends to be slower; we offset that by creating each partition task as soon as we receive its sampled boundary, rather than creating all tasks at once at the end of the aggregation.
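The streaming idea above can be sketched as follows. This is a minimal illustration, not the code in this changeset: `Partition` and `streamPartitions` are hypothetical names, plain `int` keys stand in for real document keys, and the sketch assumes the sampled boundaries arrive in ascending key order.

```go
package main

// Partition is a hypothetical half-open key range [Lower, Upper).
// A nil bound means "unbounded" on that side.
type Partition struct {
	Lower *int
	Upper *int
}

// streamPartitions emits a partition as soon as its upper boundary is
// received, instead of waiting for the boundary stream to finish.
// It assumes boundaries arrive in ascending order.
func streamPartitions(boundaries <-chan int, out chan<- Partition) {
	defer close(out)

	var lower *int
	for b := range boundaries {
		upper := b // copy so each partition holds its own pointer
		out <- Partition{Lower: lower, Upper: &upper}
		lower = &upper
	}

	// The final partition covers everything after the last boundary.
	out <- Partition{Lower: lower, Upper: nil}
}
```

With boundaries 10 and 20, a consumer would receive three partitions: everything below 10, then [10, 20), then everything from 20 upward. The first of these is available before the second boundary has even been read, which is the point of streaming.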

Because MongoDB 4.2 lacked $sampleRate (and $rand as well), the legacy partitioning logic remains for use with that server version.

Both legacy and $sampleRate partitioning now use the "available" read concern and the secondaryPreferred read preference. These aggregations don’t need consistency, but they benefit substantially from faster reads and from minimizing workload on the primary.
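For illustration, these settings could be applied at the collection-handle level roughly as below. This is a sketch assuming the v1 `go.mongodb.org/mongo-driver` module; `samplingCollection` is a hypothetical helper, and the changeset itself may set these options elsewhere (for example, per aggregation).

```go
package main

import (
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
	"go.mongodb.org/mongo-driver/mongo/readconcern"
	"go.mongodb.org/mongo-driver/mongo/readpref"
)

// samplingCollection (hypothetical) returns a collection handle whose reads
// use the "available" read concern and prefer secondaries, which suits
// sampling aggregations that do not need consistent results.
func samplingCollection(db *mongo.Database, name string) *mongo.Collection {
	opts := options.Collection().
		SetReadConcern(readconcern.Available()).
		SetReadPreference(readpref.SecondaryPreferred())

	return db.Collection(name, opts)
}
```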

A few simplifications are made here as well. For example, MongosyncID is removed from the PartitionKey struct since it’s never actually relevant, and certain parameters to the legacy partitioner are made constants (since every caller passed the same values).

@FGasper FGasper force-pushed the REP-6492-samplerate-partition branch from 180b79a to ab5b493 Compare August 20, 2025 17:13
@FGasper FGasper requested a review from tdq45gj August 20, 2025 19:40
@FGasper FGasper marked this pull request as ready for review August 20, 2025 19:40