-
Notifications
You must be signed in to change notification settings - Fork 138
Refactor hard-coded zuul
user in reproducer role
#3350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
02a1b24
to
55ebc64
Compare
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a7738eda2d9e4d6b9ecbdac4b6cf2717 ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 31m 14s |
663c195
to
9466520
Compare
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/34c153c78ef7406991c39d16c100ea72 ✔️ cifmw-pod-zuul-files SUCCESS in 4m 45s |
2f16f45
to
4822a60
Compare
zuul
user in reproducer role
# Default repositories we always want to have | ||
cifmw_reproducer_default_repositories: | ||
- src: "https://github.com/openstack-k8s-operators/ci-framework" | ||
dest: "{{ cifmw_project_dir_absolute }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using cifmw_project_dir_absolute
here does not always work as these dest
paths are for controller-0. For example the task reproducer : Clone repository if needed: ci-framework
will fail if the hypervisor and controller-0
have different users, as this task will try to create /home/<hypervisor_user_here>/src/github.com/openstack-k8s-operators/ci-framework
on controller-0 rather than /home/<controller_user_here>/src/github.com/openstack-k8s-operators/ci-framework
}} | ||
failed_when: false | ||
|
||
- name: Discover and expose CI Framework path on remote node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved this block to later in roles/reproducer/tasks/main.yml
in order to ensure that controller-0 exists before trying to set this fact. My refactor of cifmw_reproducer_default_repositories
in roles/reproducer/vars/main.yml
depends on knowing cifmw_reproducer_controller0_user
, so setting this fact before controller-0 exists means we can't pull the ansible_ssh_user
from controller-0's hostvars and the fact may be incorrect.
roles/reproducer/defaults/main.yml
Outdated
# All variables within this role should have a prefix of "cifmw_reproducer" | ||
cifmw_reproducer_user: "{{ ansible_user | default('zuul') }}" | ||
cifmw_reproducer_controller0_user: "{{ hostvars['controller-0']['ansible_ssh_user'] | default('zuul') }}" | ||
cifmw_reproducer_controller0_user_dir: "/home/{{ cifmw_reproducer_controller0_user }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enrique brought up the possibility of using something like $HOME
rather than using a long var like cifmw_reproducer_controller0_user_dir
- this would work in many cases, but some tasks will fail with this. For example:
TASK [sushy_emulator : Create and start Sushy Emulator container
...
task path: /home/testuser/ci-framework/roles/sushy_emulator/tasks/create_container.yml:26
fatal: [localhost -> controller-0(controller-0.localhost)]: FAILED! =>
changed: false
msg: Container cifmw-sushy_emulator exited with code 125 when runed
stderr: |
Error: creating named volume "$HOME/ci-framework-data/artifacts/sushy_emulator/.htpasswd": running volume create option: names must match [a-zA-Z0-9][a-zA-Z0-9_.-]*: invalid argument
stderr_lines:
- 'Error: creating named volume "$HOME/ci-framework-data/artifacts/sushy_emulator/.htpasswd":
running volume create option: names must match [a-zA-Z0-9][a-zA-Z0-9_.-]*: invalid
argument'
It would be possible to default to using $HOME
where possible, but we have to keep track of the controller0 user anyways and I figured it'd be good to be clear and consistent about using controller0_user_dir
- open to other arguments though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this task as it's using the module: podman_container it doesn't expand env variables, so I think you need to use the lookup in this way: {{ lookup('env', 'HOME') }}
But I agree with you of consistenly using the same solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor correction: {{ lookup('env', 'HOME') }}
would not be looking up HOME
in controller-0's env because this task is delegated, so this would lookup HOME on the host that is being delegated from. Something like {{ hostvars['controller-0']['ansible_env']['HOME'] }}
works though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be possible to replace cifmw_reproducer_controller0_user: "{{ hostvars['controller-0']['ansible_ssh_user'] | default('zuul') }}"
with cifmw_reproducer_controller0_user: "{{ hostvars['controller-0']['ansible_env']['USER'] | default('zuul') }}
if that's in some way better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ow I see. Honestly I don't have any other suggestions than what evallesp has already suggested to you.
Do you just want to add a fallback for the _user_dir
?
cifmw_reproducer_controller0_user_dir: "{{ hostvars['controller-0']['ansible_env']['HOME'] | default('/home/' ~ cifmw_reproducer_controller0_user) }}"
roles/reproducer/tasks/main.yml
Outdated
cifmw_basedir: "{{ ansible_user_dir ~ '/ci-framework-data' }}" | ||
cifmw_basedir: "{{ cifmw_reproducer_controller0_basedir }}" | ||
ansible_user_dir: "{{ cifmw_reproducer_controller0_user_dir }}" | ||
ansible_user_id: "{{ cifmw_reproducer_controller0_user }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking for feedback on my decision to overwrite these vars here - This task delegates to controller-0 and ends up calling several other roles that configure things assuming they can use vars like ansible_user_dir
and ansible_user_id
, but this causes issues when there are different users between localhost and controller-0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(non-blocking) reflection: I'm not opposite to this solution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general, not approving to hold on until more people chime in
roles/reproducer/defaults/main.yml
Outdated
# All variables within this role should have a prefix of "cifmw_reproducer" | ||
cifmw_reproducer_user: "{{ ansible_user | default('zuul') }}" | ||
cifmw_reproducer_controller0_user: "{{ hostvars['controller-0']['ansible_ssh_user'] | default('zuul') }}" | ||
cifmw_reproducer_controller0_user_dir: "/home/{{ cifmw_reproducer_controller0_user }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this task as it's using the module: podman_container it doesn't expand env variables, so I think you need to use the lookup in this way: {{ lookup('env', 'HOME') }}
But I agree with you of consistenly using the same solution.
roles/reproducer/tasks/main.yml
Outdated
cifmw_basedir: "{{ ansible_user_dir ~ '/ci-framework-data' }}" | ||
cifmw_basedir: "{{ cifmw_reproducer_controller0_basedir }}" | ||
ansible_user_dir: "{{ cifmw_reproducer_controller0_user_dir }}" | ||
ansible_user_id: "{{ cifmw_reproducer_controller0_user }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(non-blocking) reflection: I'm not opposite to this solution
zuul
user in reproducer rolezuul
user in reproducer role
It was already technically possible to create a controller-0 vm with a non-zuul user. This commit enables configuring controller-0 with a non-zuul user by using controller-0's `ansible_ssh_user` from `hostvars` in place of hardcoded `'zuul'`. It is worth noting that we can't simply use vars like ansible_user_id or ansible_user_dir with tasks that are delgated to controller-0. If the host we are running the reproducer from has a different user from controller-0, these vars will try to use the user from the machine that is running the reproducer rather than controller-0's user. For example, let's say we are running the reproducer from a machine with the user 'exampleuser' and we have the user 'zuul' on controller-0. If we have some task that is delegated to controller-0 and tries to create some file in the controller-0 user's home directory, we can't simply do: ``` - name: Create some file delegate_to: controller-0 ansible.builtin.file: path: "{{ ansible_user_dir }}/some_file.txt" state: touch ``` `ansible_user_dir` resolves to `/home/exampleuser`. This path does not exist on controller-0, so this task would fail.
Previously, the check for `common-requirements.txt` would check if common requirements exists on localhost at a given path. This path is based on the home directory on localhost. If the file did exist, it would use that same path to try to install the requirements on controller-0. This causes issues when localhost and controller-0 do not have the same users and home directories.
The reproducer role used a hardcoded
zuul
user and hardcoded/home/zuul
path in several places, particularly when configuring controller-0. This patch pulls out this hardcoded data into default vars in the reproducer.This patch also fixes a few incorrect assumptions that a var like
ansible_user_id
can be used when delegating to or configuring controller-0. For example, theansible_user_id
will not resolve to controller-0's user, and instead resolve to the host that the task is being delegated from. If there is a mismatch between these two users, this causes problems. This has mostly not been an issue because our ci assumes that there is azuul
user everywhere, but people running local deployments have encountered issues when using the framework on a hypervisor with a non-zuul
user.