branch-3.1: [bugfix](compaction) Fix the issue where input rowsets are prematurely evicted after compaction, causing query failures #55382 #55966
base: branch-3.1
…y evicted after compaction, causing query failures (#55382)

### What problem does this PR solve?

Problem Summary:

1. Problem background

   There is a critical bug in Doris's compaction: after input rowsets participate in compaction, their expiration time is calculated from the rowset's creation time (`creation_time`) instead of the compaction completion time.

2. Example scenario

   a. After compaction completes, the input rowsets should be kept for a further `tablet_rowset_stale_sweep_time_sec` before being discarded.
   b. Because the calculation uses the creation time, the rowsets are evicted immediately.
   c. A query that is still executing fails with the error: `[E-230]fail to find path in version_graph. spec_version: 0-1789 versions are already compacted`

3. Cause

   a. In the current implementation, `TimestampedVersion` is created using `rs->creation_time()`.
   b. Eviction check: `rowset_creation_time <= (current_time - tablet_rowset_stale_sweep_time_sec)`
   c. Rowsets that were created long ago are discarded immediately, even if they have only just participated in compaction, because their creation time is already older than the sweep threshold.

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
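The eviction check quoted in the Cause section can be illustrated with a small standalone sketch. This is not the Doris implementation: `StaleRowset`, `is_expired`, `expired_by_creation_time`, and `expired_by_compaction_time` are hypothetical names introduced here, and the second function only assumes that the fix measures expiry from the compaction completion time, as the problem description implies.

```cpp
#include <cstdint>

// Hypothetical stand-in for a stale input rowset; not a Doris type.
struct StaleRowset {
    int64_t creation_time_sec;           // when the rowset was originally written
    int64_t compaction_finish_time_sec;  // when the compaction that consumed it finished
};

// The check from the PR description:
//   reference_time <= (current_time - tablet_rowset_stale_sweep_time_sec)
bool is_expired(int64_t reference_time_sec, int64_t now_sec, int64_t sweep_sec) {
    return reference_time_sec <= now_sec - sweep_sec;
}

// Behavior described as buggy: expiry is measured from the creation time,
// so an old rowset is treated as expired right after compaction.
bool expired_by_creation_time(const StaleRowset& rs, int64_t now_sec, int64_t sweep_sec) {
    return is_expired(rs.creation_time_sec, now_sec, sweep_sec);
}

// Assumed intent of the fix: expiry is measured from the compaction
// completion time, so the rowset survives one full sweep interval.
bool expired_by_compaction_time(const StaleRowset& rs, int64_t now_sec, int64_t sweep_sec) {
    return is_expired(rs.compaction_finish_time_sec, now_sec, sweep_sec);
}
```

With a rowset created at t=1000 and compacted at t=10000, and a sweep interval of 1800 seconds, the creation-time variant reports the rowset as expired at t=10010, while the compaction-time variant keeps it until t=11800. A query that started reading the old version just before compaction would only be safe under the second variant.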
run buildall

TPC-H: Total hot run time: 32784 ms

TPC-DS: Total hot run time: 193192 ms

ClickBench: Total hot run time: 28.43 s
LGTM
Cherry-picked from #55382