Skip to content

feat: Automate file ownership setting from host migration process #609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

cliffmccarthy
Copy link
Contributor

  • Added a step to deploy_chatmail() that sets ownership of paths that are copied over as part of the upgrade process.
  • Removed manual chown step from README.md.

- Added a step to deploy_chatmail() that sets ownership of paths that
  are copied over as part of the upgrade process.
- Removed manual chown step from README.md.
@cliffmccarthy
Copy link
Contributor Author

A possible reason for not proceeding with this change (at least in its current form) is that the chown operation is non-idempotent and runs every time the deployment process is run. This could possibly be time-consuming on a large server with a lot of state data, especially in /home/vmail/mail. I find it helpful to have these steps automated for moving server state to a new host, but I realize this might not be universally applicable, so this should be considered in the context of other chatmail servers that may use this code, especially the main one.

@missytake
Copy link
Contributor

And a changelog entry would be nice :)

@cliffmccarthy
Copy link
Contributor Author

And a changelog entry would be nice :)

Added, though this will probably become a merge conflict with #608, so after one is merged, I'll rebase the other (unless we decide not to proceed with this one (or the other one)).

@cliffmccarthy
Copy link
Contributor Author

I should clarify what I mean about chown above -- the command is in fact idempotent with regard to what it operates on, namely file ownership. When I say it is not idempotent, I'm referring to the fact that it bumps inode change times as a side effect, which is usually a minor detail but occasionally may be important. It seems like there should be a word for an operation that takes no action when the target is already in the desired state, but I can't think of one at the moment.

@missytake
Copy link
Contributor

I should clarify what I mean about chown above -- the command is in fact idempotent with regard to what it operates on, namely file ownership. When I say it is not idempotent, I'm referring to the fact that it bumps inode change times as a side effect, which is usually a minor detail but occasionally may be important. It seems like there should be a word for an operation that takes no action when the target is already in the desired state, but I can't think of one at the moment.

The word you're looking for is "idempotent" :) but if chown also bumps the inode change time, it does not only operate on file ownership, so I'd say it is not idempotent.

#
stateful_paths = {
"/etc/dkimkeys": "opendkim",
"/home/vmail/mail": "vmail",
Copy link
Contributor

@missytake missytake Aug 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized - /home/vmail/mail can be very large. On large instances, chowning it can take several minutes (e.g. almost 48 minutes on nine.testrun.org); not something we should call on every cmdeploy run.

I suggest that we only call the chown steps when --disable-mail was passed? Then they will not be executed during every deploy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspected this could be high overhead on a server like that, so I agree that we should make this something that runs outside the normal operational flow. It seems like there are probably other use cases for --disable-mail where you still wouldn't want to incur that 48-minute overhead, so I'm starting to think we need a different "verb" for fixing up the state of a migrated server, to be called as a separate step. Or we just keep those steps manual in the docs.

Thanks for submitting the issue in pyinfra!

@cliffmccarthy
Copy link
Contributor Author

Another possibility for how to deal with this occurred to me. What if we establish standard numeric UIDs for the users that own these directories? Then, any move between hosts that follow the standard would not need the chown. I assume there is the potential for these UIDs to "drift" depending on what exact set of packages is installed on the host, and that's what we're trying to address with the chown. Do we have a requirement to accommodate systems where these usernames are pre-existing with arbitrary UIDs? If not, forcing specific UIDs could be a solution. (And then we would need to re-order the migration steps to do the tar after the initial cmdeploy run so that the users exist.)

In fact, is doing the tar after creating the users (with arbitrary UIDs) sufficient to ensure proper ownership? I haven't tested this, but I think tar will attempt to use the username embedded in the archive, so as long as the same username exists on both sides, it should come across correctly. Is there a reason we have to populate the stateful paths before the first cmdeploy run, or is re-ordering the operations an option?

(There may still be other non-tar methods of transporting the data between systems that are sensitive to numeric UID, but with the tar-based instructions in the README, I think establishing the necessary users first might be sufficient. (And should we still document the ownership requirements, because of that?))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants