
Conversation

@thiyyakat (Member) commented Sep 22, 2025

What this PR does / why we need it:

This PR introduces a proposal to support the temporary preservation of machines.

The use cases have been discussed here.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Release note:

Added proposal for the temporary preservation of machines.

@thiyyakat thiyyakat requested a review from a team as a code owner September 22, 2025 09:51
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Sep 22, 2025
@gardener-robot gardener-robot added the needs/review Needs review label Sep 22, 2025
@thiyyakat thiyyakat force-pushed the proposal/failed-machine-preserve branch from 20d8f8f to 9fbdb30 on September 22, 2025 09:52
@gardener-robot gardener-robot added the size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) label Sep 22, 2025
* There is a configurable limit to the number of `Failed` machines that can be preserved
* There is a configurable limit to the duration for which such machines are preserved
* Users can specify which healthy machines they would like to preserve in case of failure
* Users can request MCM to delete a preserved `Failed` machine, even before the timeout expires
Member

Suggested change
* Users can request MCM to delete a preserved `Failed` machine, even before the timeout expires
* Users can request MCM to release a preserved `Failed` machine, even before the timeout expires, so that MCM can transition the machine to `Terminating` phase and trigger its deletion.

* Users can specify which healthy machines they would like to preserve in case of failure
* Users can request MCM to delete a preserved `Failed` machine, even before the timeout expires

## Solution Design
Member

Suggested change
## Solution Design
## Proposal

```
* Since gardener worker pool can correspond to `1..N` MachineDeployments depending on number of zones, `failedMachinePreserveMax` will be distributed across N machine deployments.
* `failedMachinePreserveMax` must be chosen such that it can be appropriately distributed across the MachineDeployments.
2. Allow user/operator to explicitly request for preservation of a machine if it moves to `Failed` phase with the use of an annotation : `node.machine.sapcloud.io/preserve-when-failed=true`.
Member

> Allow user/operator to explicitly request for preservation of a machine if it moves to Failed phase with the use of an annotation

If the machine has already moved to the FAILED state and there is still capacity to preserve it, MCM will automatically preserve it; if there is no capacity, it will swiftly move the machine to termination and trigger its deletion, giving the user no chance to influence the outcome. I think you meant providing the user an option to preserve specific machines which are not yet in the FAILED state, right?

Member Author

Yes. Will reword and disambiguate.
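
For illustration only, the capacity check described above amounts to something like the following minimal sketch; the type and helper names are hypothetical and not taken from the MCM code base:

```go
package main

import "fmt"

// Hypothetical, simplified types -- not the actual MCM implementation.
type machine struct{ name string }

type machineDeployment struct {
	preserved         int // currently preserved Failed machines
	failedPreserveMax int // proposed failedMachinePreserveMax for this deployment
}

// onFailed sketches the decision discussed above: if preservation capacity
// remains, the Failed machine is kept (PreserveFailed); otherwise it is
// terminated right away, as MCM does today.
func (d *machineDeployment) onFailed(m machine) string {
	if d.preserved < d.failedPreserveMax {
		d.preserved++
		return m.name + " -> PreserveFailed"
	}
	return m.name + " -> Terminating"
}

func main() {
	d := &machineDeployment{failedPreserveMax: 1}
	fmt.Println(d.onFailed(machine{name: "m-1"})) // preserved
	fmt.Println(d.onFailed(machine{name: "m-2"})) // no capacity left, terminated
}
```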

* For a machine thus annotated, MCM will move it to `Terminating` phase even if `failedMachinePreserveTimeout` has not expired.
3. If an un-annotated machine moves to `Failed` phase, and the `failedMachinePreserveMax` has not been reached, MCM will auto-preserve this machine.
4. MCM will be modified to introduce a new stage in the `Failed` phase: `machineutils.PreserveFailed`, and a failed machine that is preserved by MCM will be transitioned to this stage after moving to `Failed`.
* In this new stage, pods can be evicted and scheduled on other healthy machines, and the user/operator can wait for the corresponding VM to potentially recover. If the machine moves to `Running` phase on recovery, new pods can be scheduled on it. It is yet to be determined whether this feature will be required.
Member

This needs to be reworded and brought out as an open question.
Open point:
Should MCM provide the consumer with the ability to trigger a drain on a preserved FAILED machine?

(Running + Requested)
├── [Machine fails + capacity available] → (PreserveFailed)
├── [Machine fails + no capacity] → Failed → Terminating
└── [User removes `node.machine.sapcloud.io/preserve-when-failed=true`] → (Running)
Member

Not clear why `[User removes node.machine.sapcloud.io/preserve-when-failed=true] → (Running)` is here.

Member Author

For completeness. If the annotation is removed from a healthy machine, the state transitions to (Running) from (Running+Requested), making it a candidate for auto-preservation by MCM on failure, and also allowing users to re-annotate.
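
As an illustration (the annotation is the one proposed above; the node name is a placeholder), requesting and withdrawing preservation on a node could look like:

```sh
# Request that this node's machine be preserved if it later fails
kubectl annotate node <node-name> node.machine.sapcloud.io/preserve-when-failed=true

# Withdraw the request again; the machine becomes a candidate for
# auto-preservation by MCM and can be re-annotated later
kubectl annotate node <node-name> node.machine.sapcloud.io/preserve-when-failed-
```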

2. Machine fails later
3. MCM preserves the machine (if capacity allows)
4. Operator analyzes the failed VM
5. Operator releases the failed machine by setting `node.machine.sapcloud.io/preserve-when-failed=false` on the node object
Member

This case only covers the scenario where the operator suspects a machine might fail and wants to ensure preservation for analysis.
Should it also include explicit release of the preserved failed machine, or is that covered in Use Case 4 (early release)?

@etiennnr left a comment

Feel free to reach out to me directly for more details!

Comment on lines +15 to +16
Currently, the Machine Controller Manager (MCM) moves Machines with errors to the `Unknown` phase, and after the configured `machineHealthTimeout`, to the `Failed` phase.
`Failed` machines are swiftly moved to the `Terminating` phase, during which the node is drained and the `Machine` object is deleted. This rapid cleanup prevents SRE/operators/support from conducting an analysis on the VM and makes finding the root cause of failure more difficult.


TL;DR: I would not restrict this feature to failing nodes.

E.g. it can happen that we detect problems with essentially all pods on a node but that this node doesn't report any condition failures (aka the node/machine will not be in a failed state).

From an SRE perspective, we want to be as available as possible. Thus, in these kinds of cases, we would cordon/drain the pods (except DaemonSets) and start investigating the node. Furthermore, since expertise is spread around the globe, we sometimes need to keep a node in a cordoned state for 24-28 hours in order to investigate the root cause with the right area's expert. However, if a node is cordoned with no workload on it, it has a very high chance of being scheduled for scale-down by CA first.

Thus, this feature should also work for non-failing nodes in order to cover all cases.

Member Author

We will extend the feature to preserve non-failing nodes as well. Will update the proposal.

Contributor

I feel we should not generalize to non-failing nodes, as it will introduce more complexity to the state management. We should only enable this for cases where the node is cordoned and drained. Also, I'm guessing we need to make some changes in CA handling for unneeded machines as well, which I don't see mentioned in this proposal.
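
For reference, the cordon-and-drain workflow mentioned in this thread is done with standard kubectl commands, independent of MCM; a minimal example (node name is a placeholder):

```sh
# Keep new workload off the node while it is being investigated
kubectl cordon <node-name>

# Evict existing pods (DaemonSet pods are left in place)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
```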


This document proposes enhancing MCM, such that:
* VMs of `Failed` machines are retained temporarily for analysis
* There is a configurable limit to the number of `Failed` machines that can be preserved


FYI, it's rare that many nodes need to be investigated. Since configuration is pretty standard across all nodes of the same worker group, investigating 1 node is usually enough. That being said, I don't have anything against this protection!

```
machineControllerManager:
  failedMachinePreserveMax: 2
  failedMachinePreserveTimeout: 3h
```


I'd set the default to 48 or 72 hours. We require expertise from around the world, and we also have to include weekends. Since this is going to be a shoot-wide setting, setting it to 3h would in many cases render this feature useless, since we can't really change settings in the shoot yaml without the shoot-owner's approval.

That being said, if the shoot-owner chooses to set a low value like 3h on purpose, then they have made the choice to limit support on problematic nodes.

Contributor

I'll also agree: if we are preserving, we should preserve for a duration of 72 hours as the default preserveTimeout, and PreserveMax should default to 1 if set.
We should also make it explicit in the doc that this setting is per worker pool.
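
As a sketch of how such a per-worker-pool setting with these defaults might look in the shoot spec (field names follow the proposal; their exact placement and the defaults shown are assumptions, since the API is still being iterated):

```yaml
spec:
  provider:
    workers:
    - name: worker-pool-1
      machineControllerManager:
        failedMachinePreserveMax: 1       # default: 1 preserved machine
        failedMachinePreserveTimeout: 72h # default: 72h, as suggested above
```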

```
* Since gardener worker pool can correspond to `1..N` MachineDeployments depending on number of zones, `failedMachinePreserveMax` will be distributed across N machine deployments.
* `failedMachinePreserveMax` must be chosen such that it can be appropriately distributed across the MachineDeployments.
2. Allow user/operator to explicitly request for preservation of a specific machine with the use of an annotation : `node.machine.sapcloud.io/preserve-when-failed=true`, such that, if it moves to `Failed` phase, the machine is preserved by MCM, provided there is capacity.


Making the annotation available on both the machine and node objects is important IMHO, so that any shoot operator can investigate a failing node by themselves (self-troubleshooting).

Contributor

Why have it at machine level at all, if you plan to have it at node level?
Won't just having it at node level suffice, like any other annotation we expose for the end user to play with the nodes.

Member

Not all machines will have a backing Node. So if there is a machine that has not initialized properly enough to become a K8s Node, and you wish to preserve the backing VM, then a provision should be provided to annotate the machine instead. Of course, this can only be done by an operator who has access to the Machine objects.

Contributor

Got it.
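
For such machines without a backing Node, an operator with access to the Machine objects could set the same (proposed) annotation on the Machine resource in the shoot's control plane namespace; a hypothetical example, with namespace and machine name as placeholders:

```sh
# Run against the seed cluster; the Machine custom resource lives in the
# shoot's control plane namespace
kubectl -n <shoot-namespace> annotate machine <machine-name> \
  node.machine.sapcloud.io/preserve-when-failed=true
```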

* Since gardener worker pool can correspond to `1..N` MachineDeployments depending on number of zones, `failedMachinePreserveMax` will be distributed across N machine deployments.
* `failedMachinePreserveMax` must be chosen such that it can be appropriately distributed across the MachineDeployments.
2. Allow user/operator to explicitly request for preservation of a specific machine with the use of an annotation : `node.machine.sapcloud.io/preserve-when-failed=true`, such that, if it moves to `Failed` phase, the machine is preserved by MCM, provided there is capacity.
3. MCM will be modified to introduce a new stage in the `Failed` phase: `machineutils.PreserveFailed`, and a failed machine that is preserved by MCM will be transitioned to this stage after moving to `Failed`.


Why not call the status Preserving instead of PreserveFailed? (As I said above, this would be useful for machines/nodes not detected as failed too.)

Contributor

I believe we need a new state diagram here: Healthy → Preserved as well as Failed → Preserved.
I'm guessing removal of the preserved state should bring the machine back to either the failed or the healthy state, as the case may be at that point in time?

Comment on lines 84 to 89
### Use Case 2: Automatic Preservation
**Scenario:** Machine fails unexpectedly, no prior annotation.
#### Steps:
1. Machine transitions to `Failed` phase
2. If `failedMachinePreserveMax` is not breached, machine moved to `PreserveFailed` phase by MCM
3. After `failedMachinePreserveTimeout`, machine is terminated by MCM


This can be risky, having many nodes in PreserveFailed, mainly if we use a relatively high failedMachinePreserveTimeout. Even with failedMachinePreserveMax, a failing worker pool could end up with multiple nodes in PreserveFailed, and depending on how MCM chooses which node to keep/replace in PreserveFailed, this could have undesired side effects.

Maybe make that available as an option in the shoot yaml (but defaulting to false)?

Comment on lines 91 to 96
### Use Case 3: Capacity Management
**Scenario:** Multiple machines fail when preservation capacity is full.
#### Steps:
1. Machines M1, M2 already preserved (failedMachinePreserveMax = 2)
2. Operator wishes to preserve M3 in case of failure. He increases `failedMachinePreserveMax` to 3, and annotates M3 with `node.machine.sapcloud.io/preserve-when-failed=true`.
3. If M3 fails, machine moved to `PreserveFailed` phase by MCM.


FYI, changing values directly in the shoot YAML is not always a possibility for operators (e.g. no permission to edit the shoot YAML, but admin permission in the cluster). That being said, this is also a valid use case.

@ezadik Sep 28, 2025

@thiyyakat
I thought that the values of the "failedMachinePreserveMax" and "failedMachinePreserveTimeout" can be defined in yaml per worker-group?

Member Author

> @thiyyakat I thought that the values of the "failedMachinePreserveMax" and "failedMachinePreserveTimeout" can be defined in yaml per worker-group?

Yes, it will be defined per worker pool.

Contributor

This again reinforces my view that we don't need maxPreservedMachine, as this will be done outside the shoot spec and will conflict with it.
We can have it as defaultPreserveMachineCount, from 1 to any value the shoot owner decides.

The operator can override this value by directly annotating the machine for preservation.

Member

If that is the case, then we can simplify this functionality and completely remove preserveMachineMax, or what it is currently called, failedMachinePreserveMax (the proposal has been iterated several times now, but that is not yet reflected here).
The user can just use annotations on the Node/Machine to indicate machines that they would like to preserve.
But do remember that it is then the responsibility of whoever adds the annotations to keep in mind that annotated machines are counted as part of the MCD replicas. This means that if they do not pay attention and mark multiple machines to be preserved, the availability of machines in an MCD can potentially be impacted. The onus of that will lie solely on the operator who made this change.


## Open Point

How will MCM provide the user with the option to drain a node when it is in `PreserveFailed` stage?


kubectl drain? I don't see this as something that should be done by MCM, IMHO. The exact purpose of this feature is to be able to test a node even if it's failing; thus, workload is sometimes needed in order to troubleshoot. I'd instead add a warning in the documentation to do the drain yourself if it's required.

Contributor

I would rather make it a prerequisite that a machine can be marked for preservation only if it has been drained/had scheduling disabled. This will ensure that removal of the preserve state directly moves the machine to the Terminating phase, removing unnecessary state management. WDYT?

Member

> I would rather make it a pre-requisite to mark a machine to be preserved only if it has been drained/scheduling disabled.

Not clear. Today MCM will only drain the machine when it reaches a FAILED state. Once it reaches that state, then, apart from cases where the drain takes a long time, the transition to Terminating and removal of the VM is pretty quick. So your suggestion would then only apply to machines that have not transitioned to the FAILED state and are manually drained.

But why should preservation of a machine be tied to draining?
In subsequent discussions the following was agreed upon:

* If the machine is not in the FAILED state and has the preserve annotation, then MCM will never drain these nodes. It is up to the operator to decide whether a drain is required and then use the kubectl drain command to do so.
* If the machine has transitioned to the FAILED state, then MCM can drain the node. Today it does so when a VM transitions to the Terminating phase, but there is hardly any time difference between FAILED and Terminating, which is why we proposed that draining can also be tied to FAILED machines only.

However, there was also a question: what happens when the VM of a machine in FAILED is rebooted from the CSP console? If it becomes healthy now, will it go back to Running? Today that is not the case, but since we are preserving FAILED machines this transition now becomes possible. Keeping that in mind, maybe we do not have to drain at all unless the machine reaches the Terminating state (the current behavior).

Comment on lines 114 to 115
2. Since gardener worker pool can correspond to 1..N MachineDeployments depending on number of zones, we will need to distribute the `failedMachinePreserveMax` across N machine deployments.
So, even if there are no failed machines preserved in other zones, the max per zone would still be enforced. Hence, the value of `failedMachinePreserveMax` should be chosen appropriately.


This is confusing to me, but I might not understand the details... Do you mean that failedMachinePreserveMax is global to all machines/MachineDeployments, but the maximum number of nodes in a given MD still takes precedence?

Member Author

Yes. Similar to the way Max for worker pools is enforced. I will update the proposal with an example.
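
As an illustration of such a distribution (this mirrors how worker pool min/max are spread across zones, but the exact scheme below is an assumption, not the final design):

```go
package main

import "fmt"

// distribute spreads a pool-wide failedMachinePreserveMax across the N
// MachineDeployments of a worker pool, with earlier zones absorbing the
// remainder.
func distribute(max, zones int) []int {
	perZone := make([]int, zones)
	for i := range perZone {
		perZone[i] = max / zones
		if i < max%zones {
			perZone[i]++
		}
	}
	return perZone
}

func main() {
	// failedMachinePreserveMax = 2 over 3 zones -> [1 1 0]: the third zone can
	// never preserve a failed machine, hence the advice to choose the value
	// appropriately.
	fmt.Println(distribute(2, 3))
}
```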

Contributor

I'd propose to remove the concept of maxPreserve, ensure the default is set to 1, and require anything more than 1 to be done manually by the operator via just preserveMachineCount.
As this is not a standard operation, we don't need the additional logic of max/min.
If the timeout is set, then the default is 1 per MachineDeployment, which can be increased by the operator as the case may be.

@ashwani2k (Contributor) left a comment

Thanks for the proposal. I've put in some comments; while reading it I felt we have not considered all the personas who can consume this feature:

1. Operators who will manipulate the machine/node objects.
2. Stakeholders who will manipulate the shoot spec.
3. Stakeholders who will manipulate the node objects.

Another dimension which I don't see mentioned is how this state is conveyed as part of the shoot status and the dashboard.

```
machineControllerManager:
  failedMachinePreserveMax: 2
  failedMachinePreserveTimeout: 3h
```
* Since gardener worker pool can correspond to `1..N` MachineDeployments depending on number of zones, `failedMachinePreserveMax` will be distributed across N machine deployments.
* `failedMachinePreserveMax` must be chosen such that it can be appropriately distributed across the MachineDeployments.
Contributor

Shouldn't this be agnostic to the end user? A value of 1 is spread to all MachineDeployments and the distribution is handled implicitly. The end user doesn't need to know the semantics of the setting; he/she just sets how many machines they would like to preserve. If it's 1, then 1 machine per zone will be preserved implicitly; if it's 2, then 2 machines per zone will be preserved.

Member

> If its 1 then 1 machine per zone will be preserved implicitly. If its 2 then 2 machines per zone will be preserved.

The convention in gardener is that min/max values for the worker pool are distributed across zones. So we would be breaking the current convention if we say that failedMachinePreserveMax is per zone of the worker pool, instead of across the worker pool.


@thiyyakat thiyyakat marked this pull request as draft October 3, 2025 07:48
@thiyyakat thiyyakat changed the title Add proposal for temporary preservation of Failed machines for diagnostics [WIP] Add proposal for temporary preservation of Failed machines for diagnostics Oct 3, 2025
@thiyyakat thiyyakat changed the title [WIP] Add proposal for temporary preservation of Failed machines for diagnostics [WIP] Add proposal for temporary preservation of machines Oct 3, 2025
@thiyyakat thiyyakat force-pushed the proposal/failed-machine-preserve branch from 0b621a2 to 849a99d on October 8, 2025 14:29
@thiyyakat thiyyakat marked this pull request as ready for review October 8, 2025 14:31
@thiyyakat thiyyakat changed the title [WIP] Add proposal for temporary preservation of machines Add proposal for temporary preservation of machines Oct 9, 2025