-
Couldn't load subscription status.
- Fork 64
feat(hgraph): ensure that the removing is thread-safe. #1269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: jinjiabao.jjb <[email protected]>
Reviewer's GuideEnsure thread-safe removal in HGraph by guarding graph and label updates under a mutex, introduce a new concurrent remove feature flag, and add comprehensive concurrent add-search-remove tests; also refine a test constant declaration. Sequence diagram for thread-safe removal in HGraph::RemovesequenceDiagram
participant "Caller"
participant "HGraph"
participant "LabelTable"
participant "RouteGraph"
participant "bottom_graph"
participant "label_lookup_mutex_"
"Caller"->>"HGraph": Remove(id)
activate "HGraph"
"HGraph"->>"LabelTable": GetIdByLabel(id)
"HGraph"->>"label_lookup_mutex_": Acquire scoped_lock
"HGraph"->>"RouteGraph": DeleteNeighborsById(inner_id) (for each level)
"HGraph"->>"bottom_graph": DeleteNeighborsById(inner_id)
"HGraph"->>"LabelTable": Remove(id)
"HGraph"->>"HGraph": delete_count_++
"HGraph"->>"label_lookup_mutex_": Release scoped_lock
"HGraph"-->>"Caller": return true
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary of ChangesHello @inabao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses the need for thread-safe removal operations within the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider wrapping the temporary vsag::Options::block_size_limit changes in a RAII guard to ensure the original limit is always restored, even if a test throws.
- Right now only the neighbor-deletion block is locked in HGraph::Remove, but the entry_point handling and route_graphs_ pop_back happens outside the lock—consider moving the mutex to cover the entire Remove() method for full thread safety.
- TestConcurrentAddSearchRemove declares an expected_success parameter but never uses it; either implement failure scenarios or remove the parameter to keep the signature accurate.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider wrapping the temporary vsag::Options::block_size_limit changes in a RAII guard to ensure the original limit is always restored, even if a test throws.
- Right now only the neighbor-deletion block is locked in HGraph::Remove, but the entry_point handling and route_graphs_ pop_back happens outside the lock—consider moving the mutex to cover the entire Remove() method for full thread safety.
- TestConcurrentAddSearchRemove declares an expected_success parameter but never uses it; either implement failure scenarios or remove the parameter to keep the signature accurate.
## Individual Comments
### Comment 1
<location> `src/algorithm/hgraph.cpp:1641-1640` </location>
<code_context>
- for (int level = static_cast<int>(route_graphs_.size()) - 1; level >= 0; --level) {
- this->route_graphs_[level]->DeleteNeighborsById(inner_id);
+ {
+ std::scoped_lock label_lock(this->label_lookup_mutex_);
+ for (int level = static_cast<int>(route_graphs_.size()) - 1; level >= 0; --level) {
+ this->route_graphs_[level]->DeleteNeighborsById(inner_id);
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Locking only label_lookup_mutex_ may not protect all shared resources.
Since route_graphs_, bottom_graph_, and label_table_ are also modified, review concurrent access to these objects and add appropriate locking if necessary to avoid data races.
</issue_to_address>
### Comment 2
<location> `tests/test_hgraph.cpp:1354-1355` </location>
<code_context>
+ auto index = TestIndex::TestFactory(test_index->name, param, true);
+ auto dataset = HGraphTestIndex::pool.GetDatasetAndCreate(
+ dim, resource->base_count, metric_type);
+ // Execute build test
+ TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true);
+ // Restore original block size limit
+ vsag::Options::Instance().set_block_size_limit(origin_size);
</code_context>
<issue_to_address>
**suggestion (testing):** Missing negative test cases for concurrent remove operations.
Add tests for failed concurrent remove operations, such as removing non-existent IDs, double-removal, and simultaneous removal of the same ID, to verify error handling and thread safety.
Suggested implementation:
```cpp
// Execute build test
TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true);
// Negative test cases for concurrent remove operations
{
// 1. Remove non-existent IDs
std::vector<size_t> non_existent_ids = {999999, 888888, 777777};
std::vector<std::future<bool>> futures;
for (auto id : non_existent_ids) {
futures.push_back(std::async(std::launch::async, [index, id]() {
return index->Remove(id); // Should fail
}));
}
for (auto& fut : futures) {
REQUIRE_FALSE(fut.get());
}
// 2. Double-removal of the same ID
size_t valid_id = 0;
REQUIRE(index->Remove(valid_id)); // First removal should succeed
REQUIRE_FALSE(index->Remove(valid_id)); // Second removal should fail
// 3. Simultaneous removal of the same ID
size_t concurrent_id = 1;
// Re-add the ID for this test if needed
index->Add(dataset->Get(concurrent_id), concurrent_id);
std::atomic<int> success_count{0};
std::vector<std::thread> threads;
for (int i = 0; i < 4; ++i) {
threads.emplace_back([&]() {
if (index->Remove(concurrent_id)) {
success_count++;
}
});
}
for (auto& t : threads) t.join();
// Only one thread should succeed in removing
REQUIRE(success_count == 1);
}
// Restore original block size limit
vsag::Options::Instance().set_block_size_limit(origin_size);
```
- If `Remove` and `Add` methods do not exist or have different signatures, you will need to adjust the calls accordingly.
- If your index implementation returns error codes or throws exceptions instead of boolean, update the assertions to match.
- If your dataset does not support `Get(id)`, replace with the correct method to retrieve an item by ID.
- If you use a custom thread pool or async framework, adapt the threading code to your conventions.
</issue_to_address>
### Comment 3
<location> `tests/test_index.cpp:2427-2436` </location>
<code_context>
+ fixtures::ThreadPool pool(5);
+ std::vector<std::future<bool>> futures;
+
+ auto func = [&](uint64_t i) -> bool {
+ auto data_one = vsag::Dataset::Make();
+ data_one->Dim(dim)
+ ->Ids(dataset->base_->GetIds() + i)
+ ->NumElements(1)
+ ->Paths(dataset->base_->GetPaths() + i)
+ ->Float32Vectors(dataset->base_->GetFloat32Vectors() + i * dim)
+ ->SparseVectors(dataset->base_->GetSparseVectors() + i)
+ ->Owner(false);
+ auto add_index = index->Add(data_one);
+ auto search_index = index->KnnSearch(data_one, 1, search_param);
+ auto remove_index = index->Remove(*(dataset->base_->GetIds() + i));
+ return add_index.has_value() & search_index.has_value() & remove_index.has_value();
+ };
+
</code_context>
<issue_to_address>
**nitpick:** Bitwise AND used for boolean logic in test assertions.
Replace '&' with '&&' to improve readability and prevent potential issues if the return types change from bool.
</issue_to_address>
### Comment 4
<location> `tests/test_index.cpp:2446-2447` </location>
<code_context>
+
+ for (auto& res : futures) {
+ auto val = res.get();
+ REQUIRE(val);
+ }
+}
+
</code_context>
<issue_to_address>
**suggestion (testing):** Test does not verify the actual state of the index after concurrent removals.
Please add assertions to confirm that removed elements cannot be found in the index after removal.
```suggestion
for (auto& res : futures) {
auto val = res.get();
REQUIRE(val);
}
// Verify that removed elements cannot be found in the index
for (uint64_t j = temp_count; j < base_count; ++j) {
auto id = *(dataset->base_->GetIds() + j);
auto search_result = index->SearchById(id);
REQUIRE(!search_result.has_value());
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| // Execute build test | ||
| TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing negative test cases for concurrent remove operations.
Add tests for failed concurrent remove operations, such as removing non-existent IDs, double-removal, and simultaneous removal of the same ID, to verify error handling and thread safety.
Suggested implementation:
// Execute build test
TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true);
// Negative test cases for concurrent remove operations
{
// 1. Remove non-existent IDs
std::vector<size_t> non_existent_ids = {999999, 888888, 777777};
std::vector<std::future<bool>> futures;
for (auto id : non_existent_ids) {
futures.push_back(std::async(std::launch::async, [index, id]() {
return index->Remove(id); // Should fail
}));
}
for (auto& fut : futures) {
REQUIRE_FALSE(fut.get());
}
// 2. Double-removal of the same ID
size_t valid_id = 0;
REQUIRE(index->Remove(valid_id)); // First removal should succeed
REQUIRE_FALSE(index->Remove(valid_id)); // Second removal should fail
// 3. Simultaneous removal of the same ID
size_t concurrent_id = 1;
// Re-add the ID for this test if needed
index->Add(dataset->Get(concurrent_id), concurrent_id);
std::atomic<int> success_count{0};
std::vector<std::thread> threads;
for (int i = 0; i < 4; ++i) {
threads.emplace_back([&]() {
if (index->Remove(concurrent_id)) {
success_count++;
}
});
}
for (auto& t : threads) t.join();
// Only one thread should succeed in removing
REQUIRE(success_count == 1);
}
// Restore original block size limit
vsag::Options::Instance().set_block_size_limit(origin_size);
- If
RemoveandAddmethods do not exist or have different signatures, you will need to adjust the calls accordingly. - If your index implementation returns error codes or throws exceptions instead of boolean, update the assertions to match.
- If your dataset does not support
Get(id), replace with the correct method to retrieve an item by ID. - If you use a custom thread pool or async framework, adapt the threading code to your conventions.
| auto func = [&](uint64_t i) -> bool { | ||
| auto data_one = vsag::Dataset::Make(); | ||
| data_one->Dim(dim) | ||
| ->Ids(dataset->base_->GetIds() + i) | ||
| ->NumElements(1) | ||
| ->Paths(dataset->base_->GetPaths() + i) | ||
| ->Float32Vectors(dataset->base_->GetFloat32Vectors() + i * dim) | ||
| ->SparseVectors(dataset->base_->GetSparseVectors() + i) | ||
| ->Owner(false); | ||
| auto add_index = index->Add(data_one); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Bitwise AND used for boolean logic in test assertions.
Replace '&' with '&&' to improve readability and prevent potential issues if the return types change from bool.
Codecov Report❌ Patch coverage is @@ Coverage Diff @@
## main #1269 +/- ##
==========================================
- Coverage 92.16% 92.08% -0.08%
==========================================
Files 314 315 +1
Lines 18930 17513 -1417
==========================================
- Hits 17446 16127 -1319
+ Misses 1484 1386 -98
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
close: #1268
Summary by Sourcery
Enable thread-safe removal in HGraph and add end-to-end concurrent add, search, and delete testing
New Features:
Tests: