@inabao
Collaborator

@inabao inabao commented Oct 22, 2025

close: #1268

Summary by Sourcery

Enable thread-safe removal in HGraph and add end-to-end concurrent add, search, and delete testing

New Features:

  • Introduce SUPPORT_ADD_SEARCH_DELETE_CONCURRENT feature flag in HGraph
  • Guard remove operations with a mutex to make HGraph deletion thread-safe

Tests:

  • Add TestHGraphConcurrentAddSearchRemove with PR and daily test cases
  • Implement TestIndex::TestConcurrentAddSearchRemove for concurrent add, search, and remove validation
  • Switch search parameter literal in brute force tests to a constexpr static string

@inabao inabao self-assigned this Oct 22, 2025
@inabao inabao added kind/feature New feature or request version/0.18 labels Oct 22, 2025
@sourcery-ai
sourcery-ai bot commented Oct 22, 2025

Reviewer's Guide

Ensure thread-safe removal in HGraph by guarding graph and label updates under a mutex, introduce a new concurrent remove feature flag, and add comprehensive concurrent add-search-remove tests; also refine a test constant declaration.
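As a rough illustration of the guarded removal described above, the following is a minimal sketch built on a toy graph. `ToyGraph`, `CountSuccessfulRemovals`, and every member name are hypothetical stand-ins, not VSAG's actual classes. One deliberate difference from the sequence diagram below: this sketch also performs the label lookup under the lock, so two threads removing the same label cannot both succeed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a graph index; not VSAG's HGraph.
class ToyGraph {
public:
    // Adds a label; returns false if the label is already present.
    bool Add(int64_t label) {
        std::scoped_lock lock(mutex_);
        if (label_to_inner_.count(label) > 0) {
            return false;
        }
        const auto inner_id = static_cast<uint32_t>(neighbors_.size());
        neighbors_.emplace_back();
        alive_.push_back(true);
        // Link to the previous node (if still alive) so removal has edges to clean.
        if (inner_id > 0 && alive_[inner_id - 1]) {
            neighbors_[inner_id].insert(inner_id - 1);
            neighbors_[inner_id - 1].insert(inner_id);
        }
        label_to_inner_[label] = inner_id;
        return true;
    }

    // Lookup, edge cleanup, label removal, and the counter all share one lock.
    bool Remove(int64_t label) {
        std::scoped_lock lock(mutex_);
        auto it = label_to_inner_.find(label);
        if (it == label_to_inner_.end()) {
            return false;  // unknown or already removed
        }
        const uint32_t inner_id = it->second;
        assert(inner_id < neighbors_.size());
        for (auto& edges : neighbors_) {
            edges.erase(inner_id);  // drop incoming edges
        }
        neighbors_[inner_id].clear();  // drop outgoing edges
        alive_[inner_id] = false;
        label_to_inner_.erase(it);
        ++delete_count_;
        return true;
    }

    std::size_t DeleteCount() const {
        std::scoped_lock lock(mutex_);
        return delete_count_;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<int64_t, uint32_t> label_to_inner_;
    std::vector<std::unordered_set<uint32_t>> neighbors_;
    std::vector<bool> alive_;
    std::size_t delete_count_ = 0;
};

// Races num_threads threads on the same label; returns how many removals succeeded.
int CountSuccessfulRemovals(ToyGraph& graph, int64_t label, int num_threads) {
    std::atomic<int> successes{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&graph, &successes, label] {
            if (graph.Remove(label)) {
                ++successes;
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return successes.load();
}
```

A single coarse mutex keeps the sketch easy to reason about. Whether the real index needs finer-grained locking around its graph levels is a separate question that this example does not answer.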

Sequence diagram for thread-safe removal in HGraph::Remove

sequenceDiagram
participant "Caller"
participant "HGraph"
participant "LabelTable"
participant "RouteGraph"
participant "bottom_graph"
participant "label_lookup_mutex_"

"Caller"->>"HGraph": Remove(id)
activate "HGraph"
"HGraph"->>"LabelTable": GetIdByLabel(id)
"HGraph"->>"label_lookup_mutex_": Acquire scoped_lock
"HGraph"->>"RouteGraph": DeleteNeighborsById(inner_id) (for each level)
"HGraph"->>"bottom_graph": DeleteNeighborsById(inner_id)
"HGraph"->>"LabelTable": Remove(id)
"HGraph"->>"HGraph": delete_count_++
"HGraph"->>"label_lookup_mutex_": Release scoped_lock
"HGraph"-->>"Caller": return true

File-Level Changes

Change: Add end-to-end concurrent add-search-remove tests
Details:
  • Introduce TestHGraphConcurrentAddSearchRemove to cover concurrent add/search/remove flows
  • Define PR and Daily TEST_CASEs invoking the new test helper
  • Implement TestConcurrentAddSearchRemove in TestIndex to orchestrate multithreaded operations
Files: tests/test_hgraph.cpp, tests/test_index.h, tests/test_index.cpp

Change: Make HGraph::Remove thread-safe and enable concurrent remove feature
Details:
  • Register SUPPORT_ADD_SEARCH_DELETE_CONCURRENT in InitFeatures
  • Protect neighbor deletion, bottom graph cleanup, and label removal under label_lookup_mutex_ lock in Remove
Files: src/algorithm/hgraph.cpp

Change: Refine test constant declaration
Details:
  • Change search_param_tmp2 to constexpr static const char* for consistency
Files: tests/test_brute_force.cpp


@gemini-code-assist
Summary of Changes

Hello @inabao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the need for thread-safe removal operations within the HGraph component. It introduces a mechanism to protect critical sections during element removal, preventing race conditions and ensuring data integrity in concurrent environments. Furthermore, it significantly expands the testing suite by adding dedicated tests for simultaneous add, search, and delete operations, thereby bolstering the reliability of the HGraph index.

Highlights

  • Thread-Safe Removal: The HGraph::Remove method has been updated to be thread-safe by introducing a std::scoped_lock around the critical section where graph structures and the label table are modified.
  • New Concurrency Feature Flag: A new feature flag, IndexFeature::SUPPORT_ADD_SEARCH_DELETE_CONCURRENT, has been added to HGraph to indicate its support for concurrent add, search, and delete operations.
  • Comprehensive Concurrent Testing: New test cases and infrastructure have been implemented to thoroughly validate the concurrent execution of add, search, and remove operations on the HGraph index, ensuring its robustness under multi-threaded workloads.

@gemini-code-assist
Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Consider wrapping the temporary vsag::Options::block_size_limit changes in a RAII guard to ensure the original limit is always restored, even if a test throws.
  • Right now only the neighbor-deletion block is locked in HGraph::Remove, but the entry_point handling and the route_graphs_ pop_back happen outside the lock; consider moving the mutex to cover the entire Remove() method for full thread safety.
  • TestConcurrentAddSearchRemove declares an expected_success parameter but never uses it; either implement failure scenarios or remove the parameter to keep the signature accurate.
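
The RAII guard mentioned in the first point could look roughly like the sketch below. `ToyOptions` is a hypothetical stand-in for a process-wide options singleton and `BlockSizeLimitGuard` is an illustrative name; only the `set_block_size_limit` setter appears in this pull request, and the getter used here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for an options singleton; not the real vsag::Options.
class ToyOptions {
public:
    static ToyOptions& Instance() {
        static ToyOptions instance;
        return instance;
    }
    std::size_t block_size_limit() const { return limit_; }
    void set_block_size_limit(std::size_t value) { limit_ = value; }

private:
    std::size_t limit_ = 128 * 1024 * 1024;
};

// Sets a temporary limit and restores the previous one when the scope exits,
// including when the scope is left through an exception.
class BlockSizeLimitGuard {
public:
    explicit BlockSizeLimitGuard(std::size_t temporary)
        : original_(ToyOptions::Instance().block_size_limit()) {
        ToyOptions::Instance().set_block_size_limit(temporary);
    }
    ~BlockSizeLimitGuard() {
        ToyOptions::Instance().set_block_size_limit(original_);
    }
    BlockSizeLimitGuard(const BlockSizeLimitGuard&) = delete;
    BlockSizeLimitGuard& operator=(const BlockSizeLimitGuard&) = delete;

private:
    std::size_t original_;
};

// Returns the limit observed inside the guarded scope; throws when `fail` is
// true to simulate a test body that exits early.
std::size_t RunWithTemporaryLimit(std::size_t temporary, bool fail) {
    BlockSizeLimitGuard guard(temporary);
    assert(ToyOptions::Instance().block_size_limit() == temporary);
    if (fail) {
        throw std::runtime_error("simulated test failure");
    }
    return ToyOptions::Instance().block_size_limit();
}
```

With such a guard the explicit "restore original block size limit" call at the end of a test body becomes unnecessary, because the destructor performs the restore on every exit path.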
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `src/algorithm/hgraph.cpp:1641-1640` </location>
<code_context>
-    for (int level = static_cast<int>(route_graphs_.size()) - 1; level >= 0; --level) {
-        this->route_graphs_[level]->DeleteNeighborsById(inner_id);
+    {
+        std::scoped_lock label_lock(this->label_lookup_mutex_);
+        for (int level = static_cast<int>(route_graphs_.size()) - 1; level >= 0; --level) {
+            this->route_graphs_[level]->DeleteNeighborsById(inner_id);
+        }
</code_context>

<issue_to_address>
**issue (bug_risk):** Locking only label_lookup_mutex_ may not protect all shared resources.

Since route_graphs_, bottom_graph_, and label_table_ are also modified, review concurrent access to these objects and add appropriate locking if necessary to avoid data races.
</issue_to_address>

### Comment 2
<location> `tests/test_hgraph.cpp:1354-1355` </location>
<code_context>
+                auto index = TestIndex::TestFactory(test_index->name, param, true);
+                auto dataset = HGraphTestIndex::pool.GetDatasetAndCreate(
+                    dim, resource->base_count, metric_type);
+                // Execute build test
+                TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true);
+                // Restore original block size limit
+                vsag::Options::Instance().set_block_size_limit(origin_size);
</code_context>

<issue_to_address>
**suggestion (testing):** Missing negative test cases for concurrent remove operations.

Add tests for failed concurrent remove operations, such as removing non-existent IDs, double-removal, and simultaneous removal of the same ID, to verify error handling and thread safety.

Suggested implementation:

```cpp
                // Execute build test
                TestIndex::TestConcurrentAddSearchRemove(index, dataset, search_param, true);

                // Negative test cases for concurrent remove operations
                {
                    // 1. Remove non-existent IDs
                    std::vector<size_t> non_existent_ids = {999999, 888888, 777777};
                    std::vector<std::future<bool>> futures;
                    for (auto id : non_existent_ids) {
                        futures.push_back(std::async(std::launch::async, [index, id]() {
                            return index->Remove(id); // Should fail
                        }));
                    }
                    for (auto& fut : futures) {
                        REQUIRE_FALSE(fut.get());
                    }

                    // 2. Double-removal of the same ID
                    size_t valid_id = 0;
                    REQUIRE(index->Remove(valid_id)); // First removal should succeed
                    REQUIRE_FALSE(index->Remove(valid_id)); // Second removal should fail

                    // 3. Simultaneous removal of the same ID
                    size_t concurrent_id = 1;
                    // Re-add the ID for this test if needed
                    index->Add(dataset->Get(concurrent_id), concurrent_id);
                    std::atomic<int> success_count{0};
                    std::vector<std::thread> threads;
                    for (int i = 0; i < 4; ++i) {
                        threads.emplace_back([&]() {
                            if (index->Remove(concurrent_id)) {
                                success_count++;
                            }
                        });
                    }
                    for (auto& t : threads) t.join();
                    // Only one thread should succeed in removing
                    REQUIRE(success_count == 1);
                }

                // Restore original block size limit
                vsag::Options::Instance().set_block_size_limit(origin_size);

```

- If `Remove` and `Add` methods do not exist or have different signatures, you will need to adjust the calls accordingly.
- If your index implementation returns error codes or throws exceptions instead of boolean, update the assertions to match.
- If your dataset does not support `Get(id)`, replace with the correct method to retrieve an item by ID.
- If you use a custom thread pool or async framework, adapt the threading code to your conventions.
</issue_to_address>

### Comment 3
<location> `tests/test_index.cpp:2427-2436` </location>
<code_context>
+    fixtures::ThreadPool pool(5);
+    std::vector<std::future<bool>> futures;
+
+    auto func = [&](uint64_t i) -> bool {
+        auto data_one = vsag::Dataset::Make();
+        data_one->Dim(dim)
+            ->Ids(dataset->base_->GetIds() + i)
+            ->NumElements(1)
+            ->Paths(dataset->base_->GetPaths() + i)
+            ->Float32Vectors(dataset->base_->GetFloat32Vectors() + i * dim)
+            ->SparseVectors(dataset->base_->GetSparseVectors() + i)
+            ->Owner(false);
+        auto add_index = index->Add(data_one);
+        auto search_index = index->KnnSearch(data_one, 1, search_param);
+        auto remove_index = index->Remove(*(dataset->base_->GetIds() + i));
+        return add_index.has_value() & search_index.has_value() & remove_index.has_value();
+    };
+
</code_context>

<issue_to_address>
**nitpick:** Bitwise AND used for boolean logic in test assertions.

Replace '&' with '&&' to improve readability and prevent potential issues if the return types change from bool.
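
To make the difference concrete, here is a small self-contained sketch; `CallLog`, `Step`, and the two counting functions are hypothetical names used only for illustration. In the test lambda quoted above, all three operations run as separate statements before the `return`, so switching to `&&` there should not change which operations execute, only how the combination reads. The difference matters when the calls appear inside the expression itself:

```cpp
#include <cassert>
#include <optional>

// Counts how many times Step was invoked.
struct CallLog {
    int calls = 0;
};

// Hypothetical operation that records the call and may or may not yield a value.
std::optional<int> Step(CallLog& log, bool ok) {
    ++log.calls;
    if (ok) {
        return 1;
    }
    return std::nullopt;
}

// Bitwise AND: both operands are always evaluated.
int CallsWithBitwiseAnd(bool first_ok) {
    CallLog log;
    bool all_ok = Step(log, first_ok).has_value() & Step(log, true).has_value();
    (void)all_ok;
    return log.calls;
}

// Logical AND: the right operand is skipped when the left one is false.
int CallsWithLogicalAnd(bool first_ok) {
    CallLog log;
    bool all_ok = Step(log, first_ok).has_value() && Step(log, true).has_value();
    (void)all_ok;
    return log.calls;
}
```

Some compilers warn about `&` applied to `bool` operands, which is another reason to prefer `&&` for boolean logic.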
</issue_to_address>

### Comment 4
<location> `tests/test_index.cpp:2446-2447` </location>
<code_context>
+
+    for (auto& res : futures) {
+        auto val = res.get();
+        REQUIRE(val);
+    }
+}
+
</code_context>

<issue_to_address>
**suggestion (testing):** Test does not verify the actual state of the index after concurrent removals.

Please add assertions to confirm that removed elements cannot be found in the index after removal.

```suggestion
    for (auto& res : futures) {
        auto val = res.get();
        REQUIRE(val);
    }

    // Verify that removed elements cannot be found in the index
    for (uint64_t j = temp_count; j < base_count; ++j) {
        auto id = *(dataset->base_->GetIds() + j);
        auto search_result = index->SearchById(id);
        REQUIRE(!search_result.has_value());
    }
```
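
The idea behind this suggestion, checking the index state after the concurrent tasks finish rather than only their return values, can be sketched in isolation. `ToyIndex`, `Contains`, and `RunAddSearchRemove` are hypothetical; the sketch makes no claim about which lookup API the real index exposes.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <mutex>
#include <unordered_set>
#include <vector>

// Hypothetical mutex-protected id set standing in for an index.
class ToyIndex {
public:
    bool Add(int64_t id) {
        std::scoped_lock lock(mutex_);
        return ids_.insert(id).second;
    }
    bool Contains(int64_t id) const {
        std::scoped_lock lock(mutex_);
        return ids_.count(id) > 0;
    }
    bool Remove(int64_t id) {
        std::scoped_lock lock(mutex_);
        return ids_.erase(id) > 0;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_set<int64_t> ids_;
};

// Runs add -> lookup -> remove for every id in [begin, end) on its own async
// task, then verifies that no id from the range is still present.
bool RunAddSearchRemove(ToyIndex& index, int64_t begin, int64_t end) {
    std::vector<std::future<bool>> futures;
    for (int64_t id = begin; id < end; ++id) {
        futures.push_back(std::async(std::launch::async, [&index, id] {
            const bool added = index.Add(id);
            const bool found = index.Contains(id);
            const bool removed = index.Remove(id);
            return added && found && removed;
        }));
    }
    bool all_ok = true;
    for (auto& fut : futures) {
        all_ok = fut.get() && all_ok;  // always call get() so every task is joined
    }
    // Post-condition: removed ids must no longer be visible.
    for (int64_t id = begin; id < end; ++id) {
        if (index.Contains(id)) {
            return false;
        }
    }
    return all_ok;
}
```

Each task touches only its own id, so the per-task checks are deterministic even though the tasks interleave; the final loop is the part that corresponds to the verification requested in this comment.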
</issue_to_address>

@codecov
codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

@@            Coverage Diff             @@
##             main    #1269      +/-   ##
==========================================
- Coverage   92.16%   92.08%   -0.08%     
==========================================
  Files         314      315       +1     
  Lines       18930    17513    -1417     
==========================================
- Hits        17446    16127    -1319     
+ Misses       1484     1386      -98     
Flag Coverage Δ
cpp 92.08% <80.00%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
common 91.78% <ø> (+0.30%) ⬆️
datacell 92.89% <ø> (-0.15%) ⬇️
index 91.24% <80.00%> (+0.15%) ⬆️
simd 100.00% <ø> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 06452d4...91c289d.

Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Signed-off-by: jinjiabao.jjb <[email protected]>
Collaborator

@wxyucs wxyucs left a comment

lgtm

Collaborator

@ShawnShawnYou ShawnShawnYou left a comment

lgtm

Collaborator

@LHT129 LHT129 left a comment

LGTM

@inabao inabao merged commit 5ee78d9 into main Oct 27, 2025
24 checks passed
@inabao inabao deleted the remove-safe branch October 27, 2025 03:28
Development

Successfully merging this pull request may close these issues.

make sure remove a point of HGraph is thread-safe.

5 participants