This repository was archived by the owner on Oct 10, 2025. It is now read-only.
Fix OverflowFile checkpoint corruption when no data is written #6046
Description
Fixes #6045
Fixes a critical bug where `OverflowFile::checkpoint()` unconditionally allocated a header page even when no data had been written, causing `PrimaryKeyIndexStorageInfo` corruption and database reopen failures.

Problem
When creating a VectorIndex without inserting any data, the database checkpoint completes successfully but corrupts the metadata. Reopening the database fails with an assertion error in `hash_index.cpp:487`.

Minimal Reproduction
Root Cause
In `src/storage/overflow_file.cpp:236`, `OverflowFile::checkpoint()` was unconditionally allocating a page even when no data had been written.

Sequence of events:
1. `PrimaryKeyIndex` (for STRING primary key)
2. `PrimaryKeyIndex` creates an `OverflowFile` (for strings >12 bytes) with `headerPageIdx = INVALID_PAGE_IDX`
3. `OverflowFile::checkpoint()` allocates a page unnecessarily
4. `PrimaryKeyIndexStorageInfo.overflowHeaderPage = 1` (should be `INVALID_PAGE_IDX`)

Solution
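
The sequence of events above, and the guard that prevents it, can be modeled in a few lines of standalone C++. This is only a sketch under stated assumptions: `PageAllocator`, `OverflowFileModel`, `markStringWritten()` and the `guardEnabled` switch are hypothetical names made up for illustration, and the real `OverflowFile` class is considerably more involved. Only `headerChanged`, `headerPageIdx` and `INVALID_PAGE_IDX` are names taken from this description.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for the storage-layer types (assumptions, not the real API).
using page_idx_t = std::uint32_t;
constexpr page_idx_t INVALID_PAGE_IDX = std::numeric_limits<page_idx_t>::max();

// Toy allocator that hands out consecutive page indices.
struct PageAllocator {
    page_idx_t nextPage = 0;
    page_idx_t allocatePage() { return nextPage++; }
};

// Simplified model of an overflow file; it only tracks the header page.
class OverflowFileModel {
public:
    explicit OverflowFileModel(PageAllocator& alloc) : allocator(alloc) {}

    // Models a long string being written, which is what flips headerChanged.
    void markStringWritten() { headerChanged = true; }

    // guardEnabled == false models the old behavior (always allocate);
    // guardEnabled == true models the early return described in this PR.
    void checkpoint(bool guardEnabled) {
        if (guardEnabled && !headerChanged) {
            return; // nothing was written, so headerPageIdx stays invalid
        }
        if (headerPageIdx == INVALID_PAGE_IDX) {
            headerPageIdx = allocator.allocatePage();
        }
        headerChanged = false;
    }

    page_idx_t getHeaderPageIdx() const { return headerPageIdx; }

private:
    PageAllocator& allocator;
    page_idx_t headerPageIdx = INVALID_PAGE_IDX;
    bool headerChanged = false;
};
```

In this model, calling `checkpoint(false)` on a fresh object allocates a header page even though nothing was written, while `checkpoint(true)` leaves `headerPageIdx` at `INVALID_PAGE_IDX` until `markStringWritten()` has been called.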
Skip checkpoint when `headerChanged == false`, following the same design pattern as `NodeTable::checkpoint()` and `RelTable::checkpoint()`.
The `headerChanged` flag is only set to `true` when actual string data (>12 bytes) is written via `OverflowFileHandle::setStringOverflow()`.

Benefits
Testing
Added comprehensive test suite in `test/storage/overflow_file_checkpoint_test.cpp` with 5 test cases:
- `InMemOverflowFileAlwaysAllocatesHeader` - Verifies in-memory behavior
- `ShortStringsDoNotTriggerOverflow` - Verifies strings ≤12 bytes are inlined
- `LongStringsDoTriggerOverflow` - Verifies strings >12 bytes use overflow
- `EmptyOverflowFileHeaderNotChanged` - Documents the core bug fix
- `VectorIndexCreationSequence` - Documents the bug scenario

All tests pass.
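
To make the 12-byte boundary exercised by the short-string and long-string cases concrete, here is a small sketch of the inline-versus-overflow decision. `INLINE_CAPACITY` and `needsOverflow()` are hypothetical names used only for illustration; the threshold itself is taken from the "≤12 bytes" and ">12 bytes" wording of the test descriptions, not from the storage code.

```cpp
#include <cstddef>
#include <string>

// Assumed inline capacity in bytes, per the test descriptions in this PR.
constexpr std::size_t INLINE_CAPACITY = 12;

// Hypothetical helper: returns true when a string is too long to be inlined
// and would therefore have to go to overflow storage.
inline bool needsOverflow(const std::string& value) {
    return value.size() > INLINE_CAPACITY;
}
```

Under this assumption a 12-byte string is still inlined and a 13-byte string is the shortest one that would touch the overflow file.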
Files Changed
- `src/storage/overflow_file.cpp` - Added early return when `headerChanged == false`
- `test/storage/CMakeLists.txt` - Added new test target
- `test/storage/overflow_file_checkpoint_test.cpp` - New test file

Impact
This fix resolves crashes when: