Conversation

@2uasimojo (Member) commented Nov 13, 2025

For purposes of this work, there are three kinds of clusters (see the sketch after this list):

  • IPI, where hive manages the provisioning via installer.
  • UPI, where a ClusterInstall implementation provisions the cluster and hive just watches and copies over the status.
  • Fake, used for testing purposes.
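
Roughly, the distinction can be sketched like this; the types, field names, and annotation key below are simplified assumptions for illustration, not the actual hive API:

```go
package main

import "fmt"

// Illustrative, simplified stand-in for a ClusterDeployment; the real hive
// API differs, and the field/annotation names here are assumptions.
type ClusterDeployment struct {
	Annotations       map[string]string
	ClusterInstallRef *string // set when a ClusterInstall implementation owns provisioning
}

// Assumed annotation key for fake clusters; purely for illustration.
const fakeClusterAnnotation = "hive.openshift.io/fake-cluster"

func provisioningKind(cd ClusterDeployment) string {
	switch {
	case cd.Annotations[fakeClusterAnnotation] == "true":
		return "fake" // test-only; hive spoofs metadata
	case cd.ClusterInstallRef != nil:
		return "upi/clusterinstall" // hive watches and copies over status
	default:
		return "ipi" // hive drives the installer itself
	}
}

func main() {
	ref := "example-cluster-install"
	fmt.Println(provisioningKind(ClusterDeployment{ClusterInstallRef: &ref}))
	fmt.Println(provisioningKind(ClusterDeployment{Annotations: map[string]string{fakeClusterAnnotation: "true"}}))
	fmt.Println(provisioningKind(ClusterDeployment{}))
}
```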

When we implemented HIVE-2302 / #2729, we didn't account for UPI/ClusterInstall, which we don't really have a good way to test, and ended up injecting a bug:

As part of that effort, we started populating a new Secret containing the metadata.json produced by the installer. For legacy clusters (those that existed before upgrading to a version with this feature) we need to retrofit that Secret from, among other things, the ClusterMetadata, which was previously how we saved off the metadata.json. For IPI, that ClusterMetadata always had a Platform section.

The changes we had to make for fake clusters had us spoofing a very sparse metadata.json and then populating it later. That process relied on the existence of the CD.Spec.ClusterMetadata.Platform section, so we were creating it for the sake of that fake-cluster path.

However, it turns out that ClusterInstall implementations don't (and don't need to) populate the ClusterMetadata.Platform section. Since we blindly copy the ClusterMetadata into the ClusterDeployment, that Platform section can be absent when we come to retrofit the metadata.json. We would then hit the path designed for fake clusters and create that section (empty). No problem, right? Except that we have a validating admission webhook that forbids changes to the ClusterMetadata section, and the new, empty Platform field was flagged as such a change and bounced by the webhook.

Phew.

So with this change, we populate that Platform section only for fake clusters, and we check for that explicitly.
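
In sketch form, with simplified stand-in types rather than the real hive structs or the actual controller code, the conditional looks something like this:

```go
package main

import "fmt"

// Simplified stand-ins for the real hive types; shapes and names here are
// illustrative assumptions only.
type Platform struct{}

type ClusterMetadata struct {
	InfraID  string
	Platform *Platform
}

type ClusterDeployment struct {
	ClusterMetadata *ClusterMetadata
	Fake            bool // stand-in for however fake clusters are detected
}

// retrofitPlatform captures the idea of the fix: only fake clusters get an
// empty Platform spoofed into ClusterMetadata. For ClusterInstall (UPI)
// clusters a missing Platform is left alone, so the retrofit never mutates
// ClusterMetadata and never trips the validating webhook.
func retrofitPlatform(cd *ClusterDeployment) {
	if cd.ClusterMetadata == nil || cd.ClusterMetadata.Platform != nil {
		return
	}
	if cd.Fake {
		cd.ClusterMetadata.Platform = &Platform{}
	}
	// Non-fake clusters without a Platform (the ClusterInstall case) are
	// deliberately left untouched.
}

func main() {
	fake := &ClusterDeployment{ClusterMetadata: &ClusterMetadata{InfraID: "fake-abc"}, Fake: true}
	upi := &ClusterDeployment{ClusterMetadata: &ClusterMetadata{InfraID: "upi-xyz"}}
	retrofitPlatform(fake)
	retrofitPlatform(upi)
	fmt.Println("fake cluster Platform populated:", fake.ClusterMetadata.Platform != nil) // true
	fmt.Println("UPI cluster Platform populated:", upi.ClusterMetadata.Platform != nil)   // false
}
```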

@openshift-ci-robot added the jira/valid-reference label Nov 13, 2025

@openshift-ci-robot commented Nov 13, 2025

@2uasimojo: This pull request references ACM-26271 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci bot requested review from dlom and suhanime November 13, 2025 16:59
@openshift-ci bot added the approved label Nov 13, 2025

@2uasimojo (Member Author)

/assign @dlom

@dlom (Contributor) commented Nov 13, 2025

/lgtm

@openshift-ci bot added the lgtm label Nov 13, 2025

@openshift-ci bot commented Nov 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, dlom

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot commented Nov 13, 2025

@2uasimojo: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 50.34%. Comparing base (47d2de1) to head (074127c).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../clusterdeployment/clusterdeployment_controller.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2780   +/-   ##
=======================================
  Coverage   50.34%   50.34%           
=======================================
  Files         279      279           
  Lines       34167    34167           
=======================================
  Hits        17201    17201           
  Misses      15612    15612           
  Partials     1354     1354           
| Files with missing lines | Coverage Δ |
|---|---|
| .../clusterdeployment/clusterdeployment_controller.go | 61.99% <0.00%> (ø) |

@2uasimojo (Member Author)

Konflux not reporting back successful runs. Reported here. Suggested workaround: force merge. Doing that.

@2uasimojo merged commit fe9c66b into openshift:master Nov 13, 2025
8 of 22 checks passed
@2uasimojo deleted the ACM-26271/clusterinstall-metadata-platform branch November 13, 2025 19:54