Fail if node cannot join the cluster because a <host>.node-password.rke2 secret already exists in cluster #335
base: main
We should not delete this file. Instead, put the logic for deleting secrets and diffed nodes here, as in the previously deleted parts, and execute it only if specific variables are set. This will cover all the cases like before in …
@@ -5,13 +5,22 @@
   args:
     executable: /bin/bash
   changed_when: false
-  register: node_names
+  register: registered_node_names

-- name: Restore etcd - remove old nodes
+- name: Restore etcd - cleanup <node>.node-password.rke2 secrets
Please delete the trailing whitespace: https://github.com/lablabs/ansible-role-rke2/actions/runs/16873948643/job/49236826242
+  ansible.builtin.shell: |
+    {{ rke2_data_path }}/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml \
+    delete secret {{ item }}.node-password.rke2 -n kube-system 2>&1 || true
+  args:
+    executable: /bin/bash
+  with_items: "{{ registered_node_names.stdout_lines }}"
+  when: item != rke2_node_name
+
+- name: Restore etcd - remove old (not existing) nodes
   ansible.builtin.shell: |
     {{ rke2_data_path }}/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml \
     delete node {{ item }} 2>&1 || true
   args:
     executable: /bin/bash
-  with_items: "{{ node_names.stdout_lines | difference(groups[rke2_cluster_group_name]) }}"
+  with_items: "{{ registered_node_names.stdout_lines | difference(groups[rke2_cluster_group_name]) }}"
   changed_when: false
This task should stay and should be triggered if another switch exists.
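For reference, the `difference(groups[rke2_cluster_group_name])` filter in the task above keeps only the registered node names that are absent from the inventory group, so only those nodes are deleted. A minimal POSIX shell sketch of that selection follows. The node names and the helper `stale_nodes` are made up for illustration and are not part of the role.

```shell
#!/bin/sh
set -eu

# Hypothetical sample data: one node name per line.
registered_nodes='server-1
server-2
agent-old'
inventory_nodes='server-1
server-2'

# stale_nodes REGISTERED INVENTORY
# Prints every name from REGISTERED that does not appear as a whole line
# in INVENTORY (the same selection the Jinja2 `difference` filter makes).
stale_nodes() {
  printf '%s\n' "$1" | while IFS= read -r name; do
    if ! printf '%s\n' "$2" | grep -qxF -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}

stale_nodes "$registered_nodes" "$inventory_nodes"
```

Matching whole lines with `grep -xF` avoids treating `server-1` as a match for `server-10`, which a plain substring match would do.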
@@ -46,6 +46,46 @@
   when: (rke2_custom_registry_mirrors | length > 0 or rke2_custom_registry_configs | length > 0)
   notify: "Config file changed"

+- name: Get current nodes secrets
+  delegate_to: "{{ active_server }}"
+  run_once: true
+  block:
+    - name: Get list of existing node secrets
+      ansible.builtin.shell: |
+        set -o pipefail
+        "{{ rke2_data_path }}/bin/kubectl" --kubeconfig /etc/rancher/rke2/rke2.yaml \
+        get secrets -n kube-system -o jsonpath="{.items[*].metadata.name}" | tr ' ' '\n' | grep -E 'node-password\.rke2$' | sed 's/\.node-password\.rke2//g'
+      args:
+        executable: /bin/bash
+      register: nodes_with_passwords # A node name on each line
+      changed_when: false
+    - name: Set fact for existing node passwords
+      ansible.builtin.set_fact:
+        nodes_with_existing_passwords: "{{ nodes_with_passwords.stdout_lines }}"
+
+
+- name: Validate presence of node password file when secret exists
+  block:
+    - name: Register if rke2 password file exists
+      ansible.builtin.stat:
+        path: /etc/rancher/node/password
+      register: node_password_file
+
+    - name: Fail if the cluster already has a <hostname>.node-password.rke2 secret and the node doesn't have a password file
+      ansible.builtin.fail:
+        msg: |
+          The node password secret already exists for node name {{ rke2_node_name }}, but no password file exists in /etc/rancher/node/password!
+          The node will not be able to join the cluster with this node name without a password file matching the secret.
+          This can happen for a few reasons:
+          - The node was previously part of the cluster and RKE2 was removed without running `kubectl delete node {{ rke2_node_name }}`.
+          - The cluster etcd was restored from a backup from before the node was correctly removed from the cluster.
+          To join this node, please recreate the file with the password, use a different node name (rke2_node_name), or remove the secret from the cluster using:
+          kubectl delete secret {{ rke2_node_name}}.node-password.rke2 -n kube-system
+      when:
Please delete the trailing whitespace: https://github.com/lablabs/ansible-role-rke2/actions/runs/16873948643/job/49236826242
+        - rke2_node_name in nodes_with_existing_passwords
+        - not node_password_file.stat.exists
+
 - name: Start RKE2 service on the rest of the nodes
   ansible.builtin.systemd:
     name: "{{ rke2_service_name }}"
Please also add this new variable to the README.md file and to argument_specs.yml.
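To make the new pre-check easier to review, here is a minimal POSIX shell sketch of its two steps. It first turns secret names into node names with the same `tr | grep | sed` pipeline as in the diff, then flags a node whose secret exists while its password file is missing. The sample secret names, the helper names `nodes_with_secrets` and `needs_password_file`, and the file path are illustrative assumptions, not part of the role.

```shell
#!/bin/sh
set -eu

# Hypothetical secret names, space separated as the jsonpath query prints them.
secret_names='rke2-serving server-1.node-password.rke2 agent-1.node-password.rke2 some-other-secret'

# nodes_with_secrets NAMES
# Prints one node name per line for every "<node>.node-password.rke2" secret.
nodes_with_secrets() {
  printf '%s\n' "$1" | tr ' ' '\n' \
    | grep -E 'node-password\.rke2$' \
    | sed 's/\.node-password\.rke2//g'
}

# needs_password_file NODE NODE_LIST PASSWORD_FILE
# Succeeds (exit 0) when NODE has a secret in NODE_LIST but PASSWORD_FILE
# does not exist, which is the situation the fail task reports.
needs_password_file() {
  printf '%s\n' "$2" | grep -qxF -- "$1" && [ ! -e "$3" ]
}

nodes="$(nodes_with_secrets "$secret_names")"

# The path below is deliberately one that should not exist.
if needs_password_file "agent-1" "$nodes" "/nonexistent/node/password"; then
  echo "agent-1: secret exists but the password file is missing"
fi
```

In the role itself the second step corresponds to the two `when:` conditions on the fail task. The sketch only mirrors that logic and does not talk to a cluster.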