-
Notifications
You must be signed in to change notification settings - Fork 500
MCO-1748: Proposed updates to MCN in 4.20 for Status Reporting
#1809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
MCO-1748: Proposed updates to MCN in 4.20 for Status Reporting
#1809
Conversation
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
5f153cc
to
12f96ab
Compare
|
||
The desired config found in the spec will get updated immediately when a new config is found on the node. However, the desired config found in the status will only get updated once the new config has been validated in the machine config daemon. In the current implementation, the desired config is populated in the status by checking whether the update successfully gets past the "UpdatePrepared" phase. If the "UpdatePrepared" phase succeeds, then the status can safely add the desired config. | ||
<!-- TODO: check on the "the image has been successfully created" condition for the deire image being in the status --> | ||
The desired config or image found in `Spec` will get updated immediately when a new config or image is found on the node. However, the desired config and image found in `Status` will only get updated once the new config has been validated in the MCD or the image has been successfully created. In the current implementation, the desired config is populated in the status by checking whether the update successfully gets past the "UpdatePrepared" phase. If the "UpdatePrepared" phase succeeds, then the status can safely add the desired config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question to Reviewer
A standard update flow should progress as follows:
- Create rendered MC
- Update desired config value in MCN's
Spec
- Validate new rendered MC
- Update desired config value in MCN's
Status
My question is whether OCL has a similar "validation" step after setting the machineconfiguration.openshift.io/desiredImage
annotation on the node and before the update to that value can occur? If not, does it make sense to have a desired image set in Spec
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify the question, are you asking if we validate the current state of the node on the previous config? Or whether we validate the incoming desiredImage to see if it's valid or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter, whether we validate the incoming desiredImage to see if it's valid or not. My intent behind this question is to understand if there is some validation post setting the desired image annotation on the node and when the update can actually proceed, similar to how MCN has the reconcile and other checks before UpdatePrepared
is flipped to True
.
There are three types of conditions in MCN: | ||
- Parent | ||
- These conditions track the overall arc of an upgrade. | ||
- Includes `UpdatePrepared`, `UpdateExecuted`, `UpdatePostActionComplete`, `RebootedNode`, `Resumed`, `UpdateComplete`, and `Updated`. | ||
- Child | ||
- These conditions are phases that occur within the overarching parent phases. | ||
- Includes `Drained`, `AppliedFilesAndOS`, `Cordoned`, and `Uncordoned`. | ||
- In 4.19, this includes `Drained`, `AppliedFilesAndOS`, `Cordoned`, and `Uncordoned`. | ||
- In 4.20, this additionally includes `ImagePulledFromRegistry`, `AppliedOSImage`, and `AppliedFiles`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to Reviewer
I'm not sure of the feasibility of "replacing" the currently existing AppliedFilesAndOS
condition with new AppliedOSImage
and AppliedFiles
conditions. I think the best course forward with this idea is simply adding AppliedOSImage
and AppliedFiles
in 4.20 and later we can think about phasing out use of AppliedFilesAndOS
if it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this answers my earlier question. I think it's fine since we should be still able to modify conditions, but I agree that this old one would be redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could easily be convinced to just keep AppliedFilesAndOS
as is instead of migrating to them being split. I have no strong opinions either way, but it was a recommendation presented by @cheesesashimi that I think is worth getting perspectives from the team on.
9608341
to
fbbc56c
Compare
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Config Image: | ||
Desired: image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:6600f777a1d8b3b5be31f483189b5dc813799fe45bb2ba18b5742b58e27e9387 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question to reviewer
Do we only want to add OCL-specific values in the MCN once OCL is enabled? I'd think yes, at least for the ConfigImage
fields, but open to other perspectives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be +1 to having OCL fields be optional and only exist if the pool the node belongs to is doing OCL
classDef Phase font-weight:bold,fill:#bbbbbb,stroke:#000,color:#000 | ||
``` | ||
|
||
*Before an update is triggered, UPDATED will be True and all other statuses will be False.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewer
That the following console outputs are what I see as the possible MCN updates during an OCL update, but I would like reviews on the flow from someone with better knowledge of OCL to chime in.
6c33db1
to
b56df3e
Compare
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general comments inline
Config Image: | ||
Desired: image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:6600f777a1d8b3b5be31f483189b5dc813799fe45bb2ba18b5742b58e27e9387 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be +1 to having OCL fields be optional and only exist if the pool the node belongs to is doing OCL
|
||
The desired config found in the spec will get updated immediately when a new config is found on the node. However, the desired config found in the status will only get updated once the new config has been validated in the machine config daemon. In the current implementation, the desired config is populated in the status by checking whether the update successfully gets past the "UpdatePrepared" phase. If the "UpdatePrepared" phase succeeds, then the status can safely add the desired config. | ||
<!-- TODO: check on the "the image has been successfully created" condition for the deire image being in the status --> | ||
The desired config or image found in `Spec` will get updated immediately when a new config or image is found on the node. However, the desired config and image found in `Status` will only get updated once the new config has been validated in the MCD or the image has been successfully created. In the current implementation, the desired config is populated in the status by checking whether the update successfully gets past the "UpdatePrepared" phase. If the "UpdatePrepared" phase succeeds, then the status can safely add the desired config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify the question, are you asking if we validate the current state of the node on the previous config? Or whether we validate the incoming desiredImage to see if it's valid or not?
There are three types of conditions in MCN: | ||
- Parent | ||
- These conditions track the overall arc of an upgrade. | ||
- Includes `UpdatePrepared`, `UpdateExecuted`, `UpdatePostActionComplete`, `RebootedNode`, `Resumed`, `UpdateComplete`, and `Updated`. | ||
- Child | ||
- These conditions are phases that occur within the overarching parent phases. | ||
- Includes `Drained`, `AppliedFilesAndOS`, `Cordoned`, and `Uncordoned`. | ||
- In 4.19, this includes `Drained`, `AppliedFilesAndOS`, `Cordoned`, and `Uncordoned`. | ||
- In 4.20, this additionally includes `ImagePulledFromRegistry`, `AppliedOSImage`, and `AppliedFiles`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this answers my earlier question. I think it's fine since we should be still able to modify conditions, but I agree that this old one would be redundant.
a09bf4f
to
38ac165
Compare
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
…e 'status reporting' in the MCO
5a32f6b
to
dfe00ab
Compare
@isabella-janssen: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
@isabella-janssen: This pull request references MCO-1748 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
This outlines the proposed changes to the MachineConfigNode (MCN) resource in 4.20 to finalize "Status Reporting" in the MCO. The primary change for 4.20 is reporting the status through OnClusterLayering-enabled node updates.
Note that since the 4.19 updates to the MCN enhancement have not yet merged (PR #1765), only the most recent commit, dfe00ab, is relevant to 4.20.