
Commit c70fe4a

feature(scale_test): scale tests without payload
New scale tests have been developed to reproduce and validate the issues identified below. The current implementation of LongevityTest does not support extended execution without a workload. To address this limitation, a new ScaleClusterTest has been introduced. It allows running tests without workloads in several scenarios:

- Initializing a large cluster to a specified target size (e.g., from 10 to 100 nodes).
- Scaling the cluster down to a desired size (e.g., from 100 to 10 nodes).
- Creating a large number of keyspaces and tables with predefined columns, or utilizing the cs-profile-template.
- Running tests with nemesis but without any payload, for a duration set by the new 'idle_duration' parameter.

These tests were developed to avoid the complexity of the LongevityTest object and to stay compatible with future scale testing efforts on Kubernetes (K8s), Docker, and other cloud providers.

Refs: scylladb/scylladb#24790, scylladb/scylla-enterprise#5626, scylladb/scylla-enterprise#5624

1 parent 8ded9b5 commit c70fe4a

9 files changed: +347 -0 lines

data_dir/templated_100_table.yaml

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
### DML ###

# Keyspace Name
keyspace: testing_keyspaces

# The CQL for creating a keyspace (optional if it already exists)
keyspace_definition: |
  CREATE KEYSPACE testing_keyspaces WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3} AND durable_writes = true;

# Table name
table: ${table_name}

# The CQL for creating a table you wish to stress (optional if it already exists)
table_definition: |
  CREATE TABLE testing_keyspaces.${table_name} (
    key1 bigint,
    key2 text,
    clustering1 bigint,
    clustering2 timeuuid,
    column1 text,
    column2 int, PRIMARY KEY ((key1, key2), clustering1, clustering2)
  ) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

# extra_definitions:
#   - CREATE INDEX IF NOT EXISTS ${table_name}_field4_${table_name} ON feeds.${table_name} (field4);

# ### Column Distribution Specifications ###

# ### Batch Ratio Distribution Specifications ###
# insert:
#   partitions: fixed(1)
#   select: fixed(1)/1000
#   batchtype: UNLOGGED

# #
# # A list of queries you wish to run against the schema
# #
# queries:
#   read1:
#     cql: SELECT * FROM feeds.${table_name} WHERE field1 = ?
#     fields: samerow

# Run stress
# cassandra-stress user profile={} cl=QUORUM 'ops(insert=1, read1=5)' duration={} -rate threads=2 -errors ignore

# customer wish (different than what we are using!)
# "INSERT INTO short (k,time,data) values (?,?,?) USING TTL ?"
# "SELECT * FROM short WHERE name = ? AND time >= ? AND time < ?"

defaults/test_default.yaml

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 db_type: "scylla"

 test_duration: 60
+idle_duration: 0
 prepare_stress_duration: 300 # 5 hours
 stress_duration: 0

docs/configuration_options.md

Lines changed: 9 additions & 0 deletions
@@ -42,6 +42,15 @@ Test duration (min). Parameter used to keep instances produced by tests<br>and f
 **type:** int


+## **idle_duration** / SCT_IDLE_DURATION
+
+Idle duration (min). Parameter used to run test without any workload
+
+**default:** N/A
+
+**type:** int
+
+
 ## **prepare_stress_duration** / SCT_PREPARE_STRESS_DURATION

 Time in minutes, which is required to run prepare stress commands<br>defined in prepare_*_cmd for dataset generation, and is used in<br>test duration calculation

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
#!groovy

// trick from https://github.com/jenkinsci/workflow-cps-global-lib-plugin/pull/43
def lib = library identifier: 'sct@snapshot', retriever: legacySCM(scm)

longevityPipeline(
    backend: 'aws',
    region: '''["eu-west-1","eu-west-2"]''',
    availability_zone: 'a,b,c',
    test_name: 'scale_cluster_test.ScaleClusterTest.test_no_workloads_idle_custom_time',
    test_config: 'test-cases/scale/scale-multi-dc-100-empty-tables-cluster-resize.yaml',
)

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
#!groovy

// trick from https://github.com/jenkinsci/workflow-cps-global-lib-plugin/pull/43
def lib = library identifier: 'sct@snapshot', retriever: legacySCM(scm)

longevityPipeline(
    backend: 'aws',
    region: 'eu-west-1',
    test_name: 'scale_cluster_test.ScaleClusterTest.test_grow_shrink_cluster',
    test_config: 'test-cases/scale/scale-20-200-20-cluster-resize.yaml',
)

scale_cluster_test.py

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See LICENSE for more details.
#
# Copyright (c) 2016 ScyllaDB

import time

from longevity_test import LongevityTest
from sdcm.utils.adaptive_timeouts import adaptive_timeout, Operations
from sdcm.utils.cluster_tools import group_nodes_by_dc_idx
from sdcm.sct_events.system import InfoEvent
from sdcm.sct_events import Severity
from sdcm.cluster import MAX_TIME_WAIT_FOR_NEW_NODE_UP, BaseScyllaCluster


class ScaleClusterTest(LongevityTest):
    @staticmethod
    def is_target_reached(current: list[int], target: list[int]) -> bool:
        """Check that the cluster size reached the target size in each DC"""
        return all([x >= y for x, y in zip(current, target)])

    @staticmethod
    def init_nodes(db_cluster: BaseScyllaCluster):
        """Overridden to support setting up large clusters."""
        db_cluster.set_seeds(first_only=True)
        db_cluster.wait_for_init(node_list=db_cluster.nodes, timeout=MAX_TIME_WAIT_FOR_NEW_NODE_UP)
        db_cluster.set_seeds()
        db_cluster.update_seed_provider()

    @property
    def cluster_target_size(self) -> list[int]:
        cluster_target_size = self.params.get('cluster_target_size')
        if not cluster_target_size:
            return []
        return list(map(int, cluster_target_size.split())) if isinstance(cluster_target_size, str) else [cluster_target_size]

    def grow_to_cluster_target_size(self, cluster_target_size: list[int]):
        """Bootstrap a node in each DC and rack while the cluster size is less than the target size"""
        nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
        current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        if self.is_target_reached(current_cluster_size, cluster_target_size):
            self.log.debug("Cluster has required size, no need to grow")
            return
        InfoEvent(
            message=f"Starting to grow cluster from {current_cluster_size} to {cluster_target_size}").publish()

        add_node_cnt = self.params.get('add_node_cnt')
        try:
            while not self.is_target_reached(current_cluster_size, cluster_target_size):
                for dcx, target in enumerate(cluster_target_size):
                    if current_cluster_size[dcx] >= target:
                        continue
                    add_nodes_num = add_node_cnt if (
                        target - current_cluster_size[dcx]) >= add_node_cnt else target - current_cluster_size[dcx]

                    for rack in range(self.db_cluster.racks_count):
                        added_nodes = []
                        InfoEvent(
                            message=f"Adding next number of nodes {add_nodes_num} to dc_idx {dcx} and rack {rack}").publish()
                        added_nodes.extend(self.db_cluster.add_nodes(
                            count=add_nodes_num, enable_auto_bootstrap=True, dc_idx=dcx, rack=rack))
                        self.monitors.reconfigure_scylla_monitoring()
                        up_timeout = MAX_TIME_WAIT_FOR_NEW_NODE_UP
                        with adaptive_timeout(Operations.NEW_NODE, node=self.db_cluster.data_nodes[0], timeout=up_timeout):
                            self.db_cluster.wait_for_init(
                                node_list=added_nodes, timeout=up_timeout, check_node_health=False)
                        self.db_cluster.wait_for_nodes_up_and_normal(nodes=added_nodes)
                        InfoEvent(f"New nodes up and normal {[node.name for node in added_nodes]}").publish()
                nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
                current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        finally:
            nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
            current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
            InfoEvent(message=f"Grow cluster finished, cluster size is {current_cluster_size}").publish()

    def shrink_to_cluster_target_size(self, cluster_target_size: list[int]):
        """Decommission a node in each DC and rack while the cluster size is more than the target size"""
        nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
        current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        if self.is_target_reached(cluster_target_size, current_cluster_size):
            self.log.debug("Cluster has required size, no need to shrink")
            return
        InfoEvent(
            message=f"Starting to shrink cluster from {current_cluster_size} to {cluster_target_size}").publish()
        try:
            nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
            while not self.is_target_reached(cluster_target_size, current_cluster_size):
                for dcx, _ in enumerate(current_cluster_size):
                    nodes_by_racks = self.db_cluster.get_nodes_per_datacenter_and_rack_idx(nodes_by_dcx[dcx])
                    for nodes in nodes_by_racks.values():
                        decommissioning_node = nodes[-1]
                        decommissioning_node.running_nemesis = "Decommissioning node"
                        self.db_cluster.decommission(node=decommissioning_node, timeout=7200)
                nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
                current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        finally:
            nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
            current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
            InfoEvent(
                message=f"Reached cluster size {current_cluster_size}").publish()

    def create_schema(self):
        number_of_table = self.params.get('user_profile_table_count') or 0
        cs_user_profiles = self.params.get('cs_user_profiles')
        keyspace_num = self.params.get('keyspace_num')
        if not number_of_table and not cs_user_profiles:
            self.log.debug("User schema will not be created")
            return
        if not cs_user_profiles:
            region_dc_names = self.db_cluster.get_datacenter_name_per_region(self.db_cluster.nodes)
            replication_factor = self.db_cluster.racks_count
            InfoEvent("Create keyspace and 100 empty tables").publish()
            for i in range(1, keyspace_num + 1):
                self.create_keyspace(keyspace_name=f"testing_keyspace_{i}", replication_factor={
                    dc_name: replication_factor for dc_name in region_dc_names.values()})
                for j in range(1, number_of_table + 1):
                    self.create_table(name=f"table_{j}", keyspace_name=f"testing_keyspace_{i}")
            InfoEvent(f"{keyspace_num} Keyspaces and {number_of_table} tables were created").publish()
        else:
            self._pre_create_templated_user_schema()

    def test_grow_shrink_cluster(self):
        """
        Grow the cluster to the target size, then shrink it back to the initial size.
        1. Create schema if needed
        2. Grow cluster to target size
        3. If bootstrap failed during grow, try to shrink the cluster back to its initial size
        4. If shrink failed during step 3, just log the error and finish the test
        """
        nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
        init_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        InfoEvent(message=f"Cluster size is {init_cluster_size}").publish()
        self.create_schema()
        try:
            InfoEvent("Start grow cluster").publish()
            self.grow_to_cluster_target_size(self.cluster_target_size)
        except Exception as ex:  # noqa: BLE001
            self.log.error(f"Failed to grow cluster: {ex}")
            InfoEvent(f"Grow cluster failed with error: {ex}", severity=Severity.ERROR).publish()

        try:
            InfoEvent("Start shrink cluster").publish()
            self.shrink_to_cluster_target_size(init_cluster_size)
        except Exception as ex:  # noqa: BLE001
            self.log.error(f"Failed to shrink cluster: {ex}")
            InfoEvent(f"Shrink cluster failed with error: {ex}", severity=Severity.ERROR).publish()
        nodes_by_dcx = group_nodes_by_dc_idx(self.db_cluster.data_nodes)
        current_cluster_size = [len(nodes_by_dcx[dcx]) for dcx in sorted(nodes_by_dcx)]
        InfoEvent(message=f"Cluster size is {current_cluster_size}").publish()

        assert current_cluster_size == init_cluster_size, f"Cluster size {current_cluster_size} is not equal to initial {init_cluster_size}"

    def test_no_workloads_idle_custom_time(self):
        """
        Run nemesis without any workload, with a configured user schema,
        for idle_duration minutes.
        """
        self.create_schema()
        self.grow_to_cluster_target_size(self.cluster_target_size)
        self.db_cluster.add_nemesis(nemesis=self.get_nemesis_class(), tester_obj=self)
        self.db_cluster.start_nemesis()
        duration = self.params.get('idle_duration')
        InfoEvent(f"Wait {duration} minutes while cluster resizing").publish()
        time.sleep(duration * 60)

        self.shrink_to_cluster_target_size(self.params.total_db_nodes)
        InfoEvent("Test done").publish()

sdcm/sct_config.py

Lines changed: 2 additions & 0 deletions
@@ -257,6 +257,8 @@ class SCTConfiguration(dict):
              Test duration (min). Parameter used to keep instances produced by tests
              and for jenkins pipeline timeout and TimoutThread.
              """),
+        dict(name="idle_duration", env="SCT_IDLE_DURATION", type=int,
+             help="""Idle duration (min). Parameter used to run test without any workload"""),
         dict(name="prepare_stress_duration", env="SCT_PREPARE_STRESS_DURATION", type=int,
              help="""
              Time in minutes, which is required to run prepare stress commands
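
As a hedged usage sketch (not from the commit): the new option can be supplied either in a test-case YAML as idle_duration: 180 or through the SCT_IDLE_DURATION environment variable declared above, and the test reads it back in minutes via self.params.get('idle_duration'). The snippet below only illustrates that minute-to-second conversion outside of SCT; the example value 180 is taken from the multi-DC config later in this commit.

import os

os.environ.setdefault("SCT_IDLE_DURATION", "180")     # 3 hours of idle time, in minutes
idle_minutes = int(os.environ["SCT_IDLE_DURATION"])   # sct_config.py declares the option as type=int
idle_seconds = idle_minutes * 60                      # test_no_workloads_idle_custom_time sleeps this long
print(f"cluster stays idle (nemesis only) for {idle_minutes} min ({idle_seconds} s)")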

test-cases/scale/scale-20-200-20-cluster-resize.yaml

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
test_duration: 6000

keyspace_num: 0
user_profile_table_count: 0
add_cs_user_profiles_extra_tables: true

n_loaders: 0
n_db_nodes: 20
add_node_cnt: 1
cluster_target_size: 200

instance_type_db: 'i4i.2xlarge'
instance_type_monitor: 'm6i.xlarge'
root_disk_size_monitor: 1000

nemesis_class_name: 'NoOpMonkey'

# This is in order to start the basic cluster faster
use_legacy_cluster_init: false
parallel_node_operations: true
seeds_num: 2
# Takes too long on big clusters
cluster_health_check: false

backtrace_decoding: false

append_scylla_yaml:
  enable_repair_based_node_ops: true

run_fullscan: []

simulated_racks: 0
instance_type_runner: 'c7i.16xlarge'
root_disk_size_runner: 1000

test-cases/scale/scale-multi-dc-100-empty-tables-cluster-resize.yaml

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
test_duration: 4000
idle_duration: 180

# cs_user_profiles:
#   - data_dir/templated_100_table.yaml
user_profile_table_count: 100
add_cs_user_profiles_extra_tables: true
keyspace_num: 1

n_loaders: 0
n_db_nodes: "60 60"
add_node_cnt: 1
round_robin: true

instance_type_db: 'i4i.2xlarge'
instance_type_loader: 'c7i.4xlarge'
instance_type_monitor: 'm6i.xlarge'
root_disk_size_monitor: 2000


# decommission 'add_node_cnt' number of nodes and add the same number of nodes
nemesis_class_name: 'DecommissionMonkey'
# as fast as possible including health checks
nemesis_interval: 1

# This is in order to start the basic cluster faster
use_legacy_cluster_init: false
parallel_node_operations: true
seeds_num: 5
# Takes too long on big clusters
cluster_health_check: false

backtrace_decoding: false

append_scylla_yaml:
  enable_repair_based_node_ops: true

run_fullscan: []

simulated_racks: 0
instance_type_runner: 'c7i.8xlarge'
root_disk_size_runner: 3000
