fix(stream): fix _changelog_row_id out of order #23240
Conversation
Pull Request Overview

This PR fixes an issue with the `_changelog_row_id` being generated out of order by implementing proper virtual node (vnode) aware row ID generation in the ChangeLog executor. The fix ensures that changelog row IDs are generated correctly using a new `ChangelogRowIdGenerator` that maintains separate sequences for each vnode.
- Replaces the placeholder row ID generation with a proper vnode-aware implementation
- Introduces a new `ChangelogRowIdGenerator` that uses the snowflake algorithm with per-vnode sequences (see the sketch after this list)
- Removes the separate `StreamRowIdGen` step by integrating row ID generation directly into the ChangeLog executor
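A minimal sketch of what such a per-vnode, snowflake-style generator can look like. The struct name follows the PR description, but the bit layout, field widths, and overflow handling below are illustrative assumptions rather than the PR's actual implementation:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

pub struct ChangelogRowIdGenerator {
    /// One independent sequence counter per vnode.
    sequences: Vec<u16>,
    /// Last millisecond timestamp handed out; never moves backwards.
    last_timestamp_ms: u64,
}

impl ChangelogRowIdGenerator {
    pub fn new(vnode_count: usize) -> Self {
        Self {
            sequences: vec![0; vnode_count],
            last_timestamp_ms: 0,
        }
    }

    /// Next row id for `vnode`: ids are increasing within a vnode because
    /// the (timestamp, sequence) pair only ever grows.
    pub fn next(&mut self, vnode: usize) -> i64 {
        let now_ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock before UNIX epoch")
            .as_millis() as u64;
        if now_ms > self.last_timestamp_ms {
            self.last_timestamp_ms = now_ms;
            // New millisecond: every vnode restarts its sequence.
            self.sequences.iter_mut().for_each(|s| *s = 0);
        }
        let seq = self.sequences[vnode];
        // A production version would wait for the next millisecond on
        // sequence overflow instead of wrapping.
        self.sequences[vnode] = seq.wrapping_add(1);
        // Assumed layout: 41-bit timestamp | 10-bit vnode | 12-bit sequence.
        ((self.last_timestamp_ms as i64) << 22)
            | (((vnode as i64) & 0x3ff) << 12)
            | ((seq as i64) & 0xfff)
    }
}
```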
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
src/stream/src/from_proto/changelog.rs | Updates builder to pass vnode information to ChangeLog executor |
src/stream/src/executor/changelog.rs | Implements proper row ID generation using ChangelogRowIdGenerator with vnode awareness |
src/frontend/src/planner/changelog.rs | Adds vnode_count extraction from base table catalog |
src/frontend/src/optimizer/plan_node/stream_changelog.rs | Adds vnode_count to protobuf serialization |
src/frontend/src/optimizer/plan_node/logical_changelog.rs | Removes StreamRowIdGen usage and adds vnode_count parameter threading |
src/frontend/src/optimizer/plan_node/generic/changelog.rs | Adds vnode_count field to generic ChangeLog structure |
src/common/src/util/row_id.rs | Implements new ChangelogRowIdGenerator with per-vnode sequence management |
proto/stream_plan.proto | Adds vnode_count field to ChangeLogNode protobuf definition |
Can you fill the PR body first?
```rust
let stream_key = input
    .stream_key()
    .ok_or_else(|| anyhow::anyhow!("changelog input must have a stream key"))?
    .to_vec();
```
I guess we can assume the stream key must exist for a stream plan node? Then we can call `expect_stream_key`. cc @chenzl25
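A minimal sketch of the suggested replacement for the quoted lines, assuming `expect_stream_key` panics when no stream key is present:

```rust
// Replaces the `ok_or_else` fallback above, under the invariant that
// stream plan nodes always have a stream key.
let stream_key = input.expect_stream_key().to_vec();
```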
```rust
let (distribution_keys, new_input) = match dist {
    Distribution::HashShard(distribution_keys)
    | Distribution::UpstreamHashShard(distribution_keys, _) => {
        (distribution_keys.clone(), input)
    }
```
Can call `dist_column_indices`.
It seems this method will panic, and it's not certain if there are any special cases here that could cause issues. This should be the same problem as the one above, cc @chenzl25
Oh, I guess you can call `enforce_concrete_distribution` then `dist_column_indices`.
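A minimal sketch of that two-step flow; both method names come from this thread, while the exact receivers and return types are assumptions:

```rust
// Force the input into a concrete distribution first, then read the key
// columns off it instead of matching on the enum by hand.
let new_input = input.enforce_concrete_distribution();
let distribution_keys = new_input.distribution().dist_column_indices().to_vec();
```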
I think the input of changelog is simply a table scan, so the distribution of the scan could be `UpstreamHashShard` or `Single`. But it seems we don't handle the `Single` distribution here. For example, the stream key of `Single` is empty, but we use it in `HashShard`.
> I think the input of changelog is simply a table scan, so the distribution of the scan could be `UpstreamHashShard` or `Single`

True. But theoretically the upstream can be any plan node, so I'd suggest using a more general implementation here. 🤔
```rust
pub fn new_with_dist(
    core: generic::ChangeLog<PlanRef>,
    dist: Distribution,
    distribution_keys: Vec<u32>,
```
I think we can obtain this from `core.input.distribution().dist_column_indices()`.
According to the comments above, the stream_key of our changelog operator should be `changelog_row_id`. Therefore, we fix the dist_key here as `changelog_row_id` instead of using the dist_key from the upstream.
Surprised to see there are no planner tests for this feature. Could you add some in this PR?
Also, can you add the test case mentioned in #23177 as an e2e test?
Generally LGTM. cc @chenzl25 Would you also take a look at the optimizer part before this gets into main?
I will take a look later today.
Can you also add some planner tests for the new plan?
```sql
statement ok
create table wide (v1 int primary key, even int, odd int);

statement ok
create sink even into wide (v1, even) as select v1, v1*2 from t4;

statement ok
create sink odd into wide (v1, odd) as select v1, v1*2+1 from t4;

statement ok
```
columns mismatch
```rust
let distribution_keys = match dist {
    Distribution::HashShard(distribution_keys)
    | Distribution::UpstreamHashShard(distribution_keys, _) => distribution_keys.clone(),
    _ => {
        vec![]
    }
};
```
Please enumerate all the distribution types explicitly; I think `Broadcast` and `SomeShard` should never appear in the input.
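A minimal sketch of the exhaustive match being requested, assuming the `Distribution` variants visible in this thread:

```rust
let distribution_keys = match dist {
    Distribution::HashShard(keys) | Distribution::UpstreamHashShard(keys, _) => keys.clone(),
    // A Single distribution has no distribution key.
    Distribution::Single => vec![],
    // Per the review, these should be impossible for changelog input;
    // fail loudly instead of silently falling back to an empty key.
    Distribution::Broadcast | Distribution::SomeShard => {
        unreachable!("changelog input should never be Broadcast or SomeShard")
    }
};
```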
Rest LGTM
✅ Cherry-pick PRs (or issues, if conflicts were encountered) have been created successfully for all target branches.
There is a compatibility issue in this PR. I created a MV with:

Then I upgraded to this PR (ad6e6ab) and inserted more rows. CN panics:
True. This is really an issue, because we are adding a new field into the plan node proto.
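A minimal sketch of a backward-compatible decode path for the new field. The struct mirrors the repeated `distribution_keys` field named in this PR; the fallback behavior is an assumption for illustration:

```rust
// prost decodes a missing `repeated uint32` field as an empty Vec, so a
// plan serialized before this PR shows up as `distribution_keys == []`.
struct ChangeLogNode {
    distribution_keys: Vec<u32>,
}

fn decode_distribution_keys(node: &ChangeLogNode) -> Option<Vec<usize>> {
    if node.distribution_keys.is_empty() {
        None // legacy plan: no distribution metadata; caller must choose a fallback
    } else {
        Some(node.distribution_keys.iter().map(|&k| k as usize).collect())
    }
}
```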
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
#23177
Because our changelog previously reused the row-id logic, which assigns vnodes in a round-robin loop, the generated IDs could go out of order. In this PR, the problem is avoided by using the table's vnode instead.
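A minimal sketch of the difference described here, with illustrative names and a stand-in hasher (the real code derives vnodes from the table's distribution-key hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Buggy pattern: the row-id path assigned vnodes round-robin in a loop,
// ignoring how the table actually hash-distributes its rows.
fn vnode_round_robin(counter: &mut usize, vnode_count: usize) -> usize {
    *counter = (*counter + 1) % vnode_count;
    *counter
}

// Fix in spirit: derive the vnode from the distribution key, the same way
// the table side does.
fn vnode_of_key<K: Hash>(key: &K, vnode_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() as usize) % vnode_count
}
```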
This pull request introduces significant improvements to how changelog row IDs are generated and handled throughout the system, with a focus on per-vnode sequencing, distribution awareness, and better timestamp management. The changes affect the row ID generation logic, the optimizer and planner, and the stream executor, ensuring that changelog row IDs are unique per vnode and properly distributed. Below are the most important changes grouped by theme:
Row ID Generation Enhancements:
- Introduced a `ChangelogRowIdGenerator` struct in `row_id.rs` that generates unique changelog row IDs with per-vnode sequence management and improved timestamp handling via a shared `TimestampManager`. This ensures uniqueness and correct ordering for distributed changelog processing.
- Refactored `RowIdGenerator` to use the new `TimestampManager` for consistent timestamp management, replacing direct timestamp logic and simplifying timestamp updates.

Stream Plan and Executor Integration:
- Updated `ChangeLogExecutor` to use the new `ChangelogRowIdGenerator`, passing in the vnode bitmap and distribution keys and generating row IDs per vnode for each chunk. It also handles vnode bitmap updates on barriers for dynamic partitioning.
- Added a `StreamExchange` when necessary to enforce the distribution.

Protobuf and Plan Node Changes:
- Extended the `ChangeLogNode` protobuf definition with a repeated `distribution_keys` field, and updated `StreamChangeLog` to carry and serialize these keys, ensuring distribution metadata is correctly propagated in the stream plan.

These changes collectively improve the reliability, scalability, and correctness of changelog row ID generation and distribution in the streaming system.