Skip to content

Conversation

@QuerthDP
Copy link
Member

@QuerthDP QuerthDP commented Dec 2, 2025

Add support for 3 Vector Store nodes for testing of more production alike environments.

Fixes: VECTOR-157

@github-actions github-actions bot added P2 and removed P2 labels Dec 2, 2025
Copilot finished reviewing on behalf of QuerthDP December 2, 2025 16:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for 3 Vector Store nodes in the Vector Store Validator test infrastructure to enable testing of production-like high availability environments. The changes refactor the API to use a new ScyllaNodeConfig struct that encapsulates node configuration including primary and secondary Vector Store URIs, replacing the previous single Vector Store URL approach.

Key changes:

  • Introduced ScyllaNodeConfig struct to configure each ScyllaDB node with primary and secondary Vector Store URIs
  • Updated all test infrastructure to support 3 DB nodes and 3 VS nodes with configurable URI mappings
  • Added new high_availability test suite to verify secondary URI failover functionality

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
crates/validator-tests/src/scylla_cluster.rs Added ScyllaNodeConfig struct and updated ScyllaCluster enum to accept node configs instead of simple URI strings
crates/validator-tests/src/lib.rs Exported the new ScyllaNodeConfig struct
crates/validator-tests/src/common.rs Updated constants for 3 VS nodes, added helper functions for node configs, and updated init/cleanup to handle multiple DNS entries
crates/validator-engine/src/scylla_cluster.rs Refactored node startup to use ScyllaNodeConfig, updated command-line argument handling to support primary/secondary URIs
crates/validator-vector-store/src/reconnect.rs Updated tests to use new get_default_node_configs API and restart method with node config
crates/validator-vector-store/src/memory_limit.rs Updated to use new VS octet constants and node config API
crates/validator-vector-store/src/lib.rs Added new high_availability test module to test suite
crates/validator-vector-store/src/high_availability.rs New test file implementing secondary URI failover testing with 1 primary and 2 secondary configurations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 4, 2025

Changelog:

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 4, 2025

Changelog:

  • rename vs_urls to default_vs_urls in get_default_scylla_node_configs

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 5, 2025

Changelog:

  • changed log level of hickory_server to dismiss DNS register logs in default visibility level
  • added support for 3 node Vector Store cluster

Copilot finished reviewing on behalf of QuerthDP December 5, 2025 10:04
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 5, 2025

Changelog:

  • refactor serde.rs test significantly to use only one loop (main logic unchanged)
  • added timeout for wait_for_node (VS)
  • removed busy wait in full_scan.rs
  • changed test_secondary_uri_works_correctly to only use one VS instance

@QuerthDP QuerthDP marked this pull request as ready for review December 5, 2025 11:04
@github-actions github-actions bot removed the P2 label Dec 5, 2025
@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 5, 2025

Changelog:

  • add prepare_connection_with_vs_ips function to create only subset of clients.
  • add test for VS cluster restart

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 5, 2025

Changelog:

  • fix cargo machete

Copy link
Collaborator

@smoczy123 smoczy123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. I have one question: why do we support only 3 node clusters? I think it would make sense to support a variable (probably reasonably bounded) number of nodes in a cluster

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 5, 2025

Generally looks good. I have one question: why do we support only 3 node clusters? I think it would make sense to support a variable (probably reasonably bounded) number of nodes in a cluster

The default is 3 node (both Scylla and Vector Store), but it is configurable in custom setup (f.e. look high_availability.rs)

@swasik
Copy link
Collaborator

swasik commented Dec 5, 2025

Generally looks good. I have one question: why do we support only 3 node clusters? I think it would make sense to support a variable (probably reasonably bounded) number of nodes in a cluster

The default is 3 node (both Scylla and Vector Store), but it is configurable in custom setup (f.e. look high_availability.rs)

Also 3 is what we enforce in XCloud and recommend in other scenarios.

Copy link
Collaborator

@ewienik ewienik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR should be rebased to the master

@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 8, 2025

Changelog:

  • rebased on master
  • removed up/down functions for Vector Store cluster
  • renamed up_node, down_node to start_node, stop_node
  • fixed memory_limit test

@QuerthDP QuerthDP requested a review from smoczy123 December 8, 2025 10:40
@github-actions github-actions bot added P2 and removed P2 labels Dec 8, 2025
@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 8, 2025

Changelog:

  • fix commit message not to mention up, down functions

Move ScyllaDB nodes IP last octets from 2-254 to 1-127.
Move Vector Store node IP last octets from 1 to 128-254.
This setup supports adding more ScyllaDB and Vector Store
nodes in testing if needed.

Changed `VS_NAME` to a list of available DNS names for Vector Store
nodes. Now the names are `vs1`, `vs2`, `vs3`, and more if needed.

Renamed `get_default_vs_url` function to `get_default_vs_urls`
and adjusted it to return URLs of three default Vector Store nodes.

Added `get_default_vs_ips` function to return vector of three
default Vector Store node IPs (not to confuse with DNS names).

Adjusted `get_default_node_configs` so that now the default nodes
does support one primary Vector Store node and two secondary ones.
Since registering multiple Vector Store nodes in the DNS server
the logs are getting spammed by the messages of having them registered.
Let the messages be visible only on `--verbose` mode.
You can also turn them on by setting `RUST_LOG` env var to INFO or higher.
Mention ScyllaDB node in tracing managing ScyllaDB cluster.
Rename `node_configs` to `scylla_configs` in `common.rs`.
Implemented necessary Vector Store cluster management interface
similar to the one used for ScyllaDB cluster.

Added `start_node`, `stop_node`, `restart` operations.

From this patch, custom Vector Store node config is available
including number of nodes, node IPs, DB IPs which node connect to,
and environmental variable sets.

Adjust existing tests to support 3 Vector Store node architecture,
especially waiting for all nodes to have their indices created.
Add test restarting VS cluster and checking if the restart
does not break ScyllaDB connection to VS nodes.
@QuerthDP
Copy link
Member Author

QuerthDP commented Dec 8, 2025

Changelog:

  • rebased on master

@ewienik ewienik added this pull request to the merge queue Dec 8, 2025
Merged via the queue into scylladb:master with commit fdb4e7f Dec 8, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants