-
Notifications
You must be signed in to change notification settings - Fork 50
nexus-db-queries: verify blueprint test covers all tables (#8455) #8667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ter#8455) Add ensure_blueprint_fully_populated() to check all bp_* tables get populated. Add ClickHouse config to representative blueprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a swing at this.
macro_rules! query_count { | ||
($table:ident, $blueprint_id_col:ident) => {{ | ||
use nexus_db_schema::schema::$table::dsl; | ||
let result = dsl::$table | ||
.filter( | ||
dsl::$blueprint_id_col | ||
.eq(to_db_typed_uuid(blueprint_id)), | ||
) | ||
.count() | ||
.get_result_async(&*conn) | ||
.await; | ||
(stringify!($table), result) | ||
}}; | ||
} | ||
|
||
// Tables prefixed with `bp_` that are *not* specific to a single blueprint | ||
// and therefore intentionally ignored. | ||
let tables_ignored: BTreeSet<_> = ["bp_target"].into_iter().collect(); | ||
|
||
let mut tables_checked = BTreeSet::new(); | ||
for (table_name, result) in [ | ||
query_count!(blueprint, id), | ||
query_count!(bp_sled_metadata, blueprint_id), | ||
query_count!(bp_omicron_dataset, blueprint_id), | ||
query_count!(bp_omicron_physical_disk, blueprint_id), | ||
query_count!(bp_omicron_zone, blueprint_id), | ||
query_count!(bp_omicron_zone_nic, blueprint_id), | ||
query_count!(bp_clickhouse_cluster_config, blueprint_id), | ||
query_count!(bp_clickhouse_keeper_zone_id_to_node_id, blueprint_id), | ||
query_count!(bp_clickhouse_server_zone_id_to_node_id, blueprint_id), | ||
query_count!(bp_oximeter_read_policy, blueprint_id), | ||
query_count!(bp_pending_mgs_update_sp, blueprint_id), | ||
query_count!(bp_pending_mgs_update_rot, blueprint_id), | ||
query_count!(bp_pending_mgs_update_rot_bootloader, blueprint_id), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we want to commonize most of this function with ensure_blueprint_fully_deleted()
. Maybe we want a function that just counts the rows in each of these tables and returns whether they're all zero or all non-zero or something in between? Then call that from here and ensure_blueprint_fully_deleted()
?
That doesn't allow us flexibility when we really do have an exception. For example, some combinations might be mutually exclusive and we'd need to be able to say it's okay if this table or that table has no rows. The function could return a struct of counters and the caller could decide what it expects? It'd probably be useful to have helpers fn all_empty(&self) -> bool
, fn all_non_empty() -> bool
, fn empty_tables() -> Vec<String>
(returns a list of the empty tables), and fn non_empty_tables() -> Vec<String>
so that callers could implement exceptions without having to check all the non-exceptional cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you David! This approach is more flexible and much cleaner indeed!
// Pending MGS update tables are only populated when the blueprint includes firmware updates. | ||
// The representative blueprint doesn't, so allow zero rows for these tables. | ||
if matches!( | ||
table_name, | ||
"bp_pending_mgs_update_sp" | ||
| "bp_pending_mgs_update_rot" | ||
| "bp_pending_mgs_update_rot_bootloader" | ||
) { | ||
tables_checked.insert(table_name); | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the representative collection did have some kind of update. If not, shouldn't it? Maybe it could have one of each type (probably with different baseboard ids)? Our planner would never do that but I don't believe it's actually invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked and the representative blueprint seems to currently not having firmware updates. It makes sense to add one of each type with different baseboard IDs. Thank you! 🙂
// ----------------------------------------------------------------- | ||
// Populate minimal Clickhouse cluster configuration so the blueprint | ||
// exercises the `bp_clickhouse_*` tables (issue #8455). | ||
// ----------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we don't use these // --------
comments.
// Populate minimal Clickhouse cluster configuration so the blueprint | ||
// exercises the `bp_clickhouse_*` tables (issue #8455). | ||
// ----------------------------------------------------------------- | ||
if blueprint.clickhouse_cluster_config.is_none() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reluctant to change the representative blueprint to include multi-node Clickhouse because right now this is not the configuration we use outside of development. I'd rather we be testing the one we use in production. This example is what led to my comment elsewhere that the caller might want to decide if some tables should just be exceptions. This way, the main representative blueprint could not have this in it, the test could verify that these are the only empty tables, and then if we want we could have a different blueprint that does have multi-node clickhouse in it and make sure it has the values we expect in the tables (and that those get cleaned up when it gets deleted). Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That make sense, thank you! I'll remove the Clickhouse changes and rely on the table checking exceptions.
Create BlueprintTableCounts struct and refactor both ensure functions to use it. Remove ClickHouse config from representative blueprint.
Add ensure_blueprint_fully_populated() to check all bp_* tables get populated by representative blueprint. Complements existing ensure_blueprint_fully_deleted().
Add ClickHouse config to representative blueprint.
Fixes #8455