Contributor

@pingtimeout pingtimeout commented Aug 19, 2025

This PR is the next iteration of release automation. It builds on top of #2156 and reuses the common bash libraries that were defined.

Differences from the initial PR

  • Release automation can only be triggered via GitHub workflows. It is not possible to perform a semi-automated release from a committer's or PMC member's computer.
  • It assumes that the following secrets are defined:
    • DOCKERHUB_USERNAME and DOCKERHUB_TOKEN - the credentials that can be used to push Docker images to Docker Hub
    • GPG_PRIVATE_KEY and GPG_PASSPHRASE - the ASCII-armored private key that should be used to sign artifacts and its associated passphrase. The associated public key is assumed to have been added to the KEYS file before the private key is entered here.
    • APACHE_USERNAME and APACHE_PASSWORD - the credentials that can be used to connect to the ASF SVN server as well as to the ASF Nexus server.
  • All code that was used to perform release steps locally has been removed to keep the PR as small as possible.
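
As a rough illustration of the assumption above, a workflow step could fail fast when one of these secrets is not exposed to the job. This is only a sketch: the function name `check_required_secrets` is hypothetical and not part of this PR, and it assumes the secrets are passed to the step as environment variables with the same names.

```shell
#!/usr/bin/env bash
# Sketch only: check_required_secrets is a hypothetical helper, not code from this PR.
# The variable names are the secrets listed in the PR description.

REQUIRED_SECRETS=(
  DOCKERHUB_USERNAME DOCKERHUB_TOKEN
  GPG_PRIVATE_KEY GPG_PASSPHRASE
  APACHE_USERNAME APACHE_PASSWORD
)

# Prints the name of every unset or empty variable to stderr
# and returns 1 if at least one is missing, 0 otherwise.
check_required_secrets() {
  local missing=0 name
  for name in "${REQUIRED_SECRETS[@]}"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "Missing secret: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

A step would then call `check_required_secrets || exit 1` before doing any signing or publishing.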

Similarities with the initial PR

This PR builds on the same assumptions as the initial PR:

  • The release cannot be fully automated today because there are concerns that the release guide may not be comprehensive, so full automation is not desirable yet.
  • The release can only be semi-automated because certain operations must be performed manually by the release manager.

Remaining known-unknowns

I can see that apache/polaris has a Nightly build GitHub workflow that publishes snapshots every night to the Apache Nexus repository. In this workflow's definition, I can find references to two Nexus credentials (secrets.NEXUS_USER and secrets.NEXUS_PW). However, I cannot find any such secret defined on https://github.com/apache/polaris/settings/secrets/actions, nor am I sure whether those credentials can be used to interact with the ASF SVN server. Some clarification is needed to ensure the credentials are configured properly.

Example runs

I have used this PR to simulate the release of Polaris 99.98.97-incubating-rc1 on my own fork. No upload was performed; this is just to show that things should work as expected. You can find links to the following workflow executions:

  • This Create Release Branch workflow was used to cut the release branch with proper naming pattern: release/99.98.97-incubating. It was run with dry-run=0 so that the release branch was actually created.
  • This Update version and Changelog for Release Candidate workflow was used to set the Polaris version to 99.98.97-incubating, update the changelog and push the RC1 tag. It was also run with dry-run=0 so that the modifications were actually performed (on my fork only).
  • For comparison, this Update version and Changelog for Release Candidate workflow is what happens when we try to cut an RC from a commit where some GitHub checks have failed (e.g. CI).
  • From there, all subsequent workflows were run with dry-run=1 so that no interaction with any ASF server happened.
  • This Build and Publish Release Artifacts workflow was run with dry-run=1 to check the commands that would be executed if an actual release were to be performed. We can see that the binaries and Helm Charts would be built, signed, checksummed and published to the ASF SVN and Nexus repositories. Docker images would be built but not published.
  • And finally, this Publish Release After Vote Success workflow shows how the artifacts would be moved from the dist dev space to the dist release space on ASF SVN, the final release tag would be created and the Nexus repository would be automatically released. This is also where the GitHub Release itself would be created.
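
To make the dry-run behaviour described above more concrete, here is a minimal sketch of how such a guard can be written in bash. The names `DRY_RUN` and `run_cmd` are assumptions made for this illustration and may not match the helpers in the PR's bash libraries; the sketch defaults to the safe mode where nothing is executed.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN and run_cmd are hypothetical names, not the PR's actual helpers.

# Default to dry-run so that a forgotten setting never publishes anything.
DRY_RUN="${DRY_RUN:-1}"

# Runs the given command, or only prints it when DRY_RUN is 1.
run_cmd() {
  if [[ "${DRY_RUN}" == "1" ]]; then
    # %q shell-quotes each argument so the printed line can be pasted back into a shell.
    printf 'DRY-RUN:'
    printf ' %q' "$@"
    printf '\n'
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 run_cmd git push origin release/99.98.97-incubating` only prints the command, while the same call with `DRY_RUN=0` would actually push.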

Next steps

If this PR is approved, it should be possible to use the Create Release Branch workflow to cut the release branch, and the Update Release Candidate workflow to set the RC version and create the RC tag.
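
As an illustration of the naming conventions used in the example runs, the sketch below splits a release candidate identifier such as 99.98.97-incubating-rc1 into the version, the RC number and the release/<version> branch name. The function `parse_rc` and the exact pattern are assumptions made for this example, not code from the PR.

```shell
#!/usr/bin/env bash
# Sketch only: parse_rc is a hypothetical helper, not code from this PR.

# Splits "<major>.<minor>.<patch>-incubating-rc<N>" into its parts and prints
# the version, the RC number and the matching release branch name.
# Returns 1 when the identifier does not follow that pattern.
parse_rc() {
  local rc_pattern='^([0-9]+\.[0-9]+\.[0-9]+-incubating)-rc([0-9]+)$'
  if [[ "$1" =~ ${rc_pattern} ]]; then
    echo "version=${BASH_REMATCH[1]}"
    echo "rc_number=${BASH_REMATCH[2]}"
    echo "branch=release/${BASH_REMATCH[1]}"
  else
    echo "Invalid release candidate identifier: $1" >&2
    return 1
  fi
}
```

Calling `parse_rc 99.98.97-incubating-rc1` prints the version 99.98.97-incubating, the RC number 1 and the branch release/99.98.97-incubating, one `key=value` pair per line.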

Recommendations for reviewers

  • Please review the releasey/release-process-flowchart.png flowchart first. It shows how each workflow participates in the overall release process. It should substantially help make sense of the GitHub workflows for the rest of the review.

Member

@snazy snazy left a comment

Thanks a lot for this!

This looks much simpler and non-controversial (because there's just one very well defined release runtime environment).

workflows vs jobs

Workflows 2, 3, 4 and 5 should IMHO be separate jobs within one workflow.
It's one (human) interaction, so it should be one workflow. The individual failed jobs can be re-run in case of a failure. All validation steps only need to happen once.

Having one workflow instead of multiple is easier to reason about. It's overall not more complex than multiple workflows and the dependencies are clear.

The jobs and their dependencies I have in mind are these

+--------------------------+
| Validation & preparation |
+--------------------------+
           |
+--------------------------+
| Update ver & changelog   |
+--------------------------+
           |
           |          +-------------------------------------+
           +----------| Build and Publish Release Artifacts |-----+
           |          +-------------------------------------+     |
           |                                                      |
           |          +-------------------------------------+     |
           +----------| Build and Publish Docker Images     |     |
                      +------------------+------------------+     |
                                         |                        |
                                         |                        |
                      +------------------+------------------+     |
                      | Build and Stage Helm Charts         |-----+
                      +-------------------------------------+     |
                                                                  |   +--------------------------------------+
                                                                  +---| Finish, mail templates & information |
                                                                      +--------------------------------------+

(Haven't checked whether the Helm Charts part has to depend on the Docker images or not.)

It would look similar in the GH workflow run summary. Here's an example of a release workflow, which uses multiple jobs.

It would also remove the need for the if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then blocks.

GH archives & release attachments

I think it's useful to use the GH upload action to archive some artifacts like the binary distribution tarballs/zips, helm tarball/zip. Those can be eventually attached to the GH release, which makes it quite convenient for users to just download the artifacts they want from a GH release page.

Some "FYI"

  • GitHub offers "step summaries" that appear in the workflow run summary.
    You can write markdown formatted output to $GITHUB_STEP_SUMMARY.
    For example, this workflow code produces this corresponding workflow summary part.

  • Another one: If you wanna group things together, you can use these "output groups".
    For example, something like

    echo "::group::my group title"
    echo "foo"
    echo "bar"
    echo "baz"
    echo "::endgroup::"

    writes the lines "foo", "bar" and "baz" to the workflow log as an expandable group titled "my group title".
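
Both features can be wrapped in small helpers. The sketch below is only an illustration: `write_summary` and `run_in_group` are hypothetical names, and the fallback to a temporary file is an assumption that lets the script run outside GitHub Actions, where GITHUB_STEP_SUMMARY is not set.

```shell
#!/usr/bin/env bash
# Sketch only: write_summary and run_in_group are hypothetical helper names.

# GitHub Actions sets GITHUB_STEP_SUMMARY; fall back to a temp file for local runs.
GITHUB_STEP_SUMMARY="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

# Appends a Markdown heading followed by one bullet per remaining argument
# to the step summary file.
write_summary() {
  local title="$1"; shift
  {
    echo "## ${title}"
    local item
    for item in "$@"; do
      echo "- ${item}"
    done
    echo
  } >> "${GITHUB_STEP_SUMMARY}"
}

# Runs a command inside an expandable log group and preserves its exit status.
run_in_group() {
  local title="$1"; shift
  echo "::group::${title}"
  "$@"
  local status=$?
  echo "::endgroup::"
  return "${status}"
}
```

For example, `run_in_group "Build binaries" ./gradlew build` folds the build output into one group, and `write_summary "Release candidate" "Version: 99.98.97-incubating"` adds a short section to the run summary. The group title and the Gradle invocation are placeholders.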

--target "${new_tag_ref}"

print_success "🎉 GitHub release created: ${release_title}"
else
Member

I think what's missing here is the cleanup of the previous RC, like purging its data in SVN and deleting the previous RC's Nexus staging repo.

Contributor Author

pingtimeout commented Aug 25, 2025

Thanks for the review @snazy. I have updated the code to leverage step summaries and to merge workflows 3, 4 and 5 into jobs of a single workflow. For now, I am keeping workflows 2 and 3 separate for testing purposes: workflow 2 can be run in real mode (create branch, update changelog, ...), but workflows 3 and above should only be run in dry-run mode (no publication to Apache, etc.) until we have the infra setup (secrets) to perform automatic releases.

I am going to do a bunch of test runs with those changes and will keep this PR updated.

EDIT: I updated this PR description with the new workflow runs that contain GitHub step summaries.
