kpedro88
Contributor

What does the PR do?

This PR adds three new flags and updates one flag in build.py. These changes all go toward increasing build flexibility.

  1. Add flag --no-container-cache, which propagates to docker --no-cache.
  2. Add flag --default-repo-tag <tag> to override the calculated default value, which is not always appropriate. For example, when building a dev version (currently 25.08), the upstream container version is set to the previous version (25.07) and takes precedence in the calculated default, even though the corresponding dev versions of the component and backend repositories may be intended. Overriding the default globally is more convenient than overriding it for each repo.
  3. Add flag --use-buildbase to use the temporary "buildbase" image as the "base" image for backends that need it (e.g. onnxruntime).
  4. Extend --backend syntax to <backend-name>[:<repo-tag>][:<org>] to allow specifying a different organization/repository, in addition to a different tag/branch. This is useful for external contributors developing in forks.
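
The extended --backend syntax in item 4 can be illustrated with a minimal sketch. It reuses the regular expression from this PR, but the function name parse_backend_spec, its return shape, and its default arguments are hypothetical and are not necessarily how build.py structures this logic.

```python
import re

# Pattern taken from this PR: capture an http(s) URL whole, otherwise split on ":".
_BACKEND_SPLIT = re.compile(r"(https?:\/\/[^\s:]+)|:")

# Default org as it appears in the dry-run output of this PR.
DEFAULT_ORG = "https://github.com/triton-inference-server"


def parse_backend_spec(spec, default_tag, default_org=DEFAULT_ORG):
    """Split '<backend-name>[:<repo-tag>][:<org>]' into (name, tag, org).

    Hypothetical helper, for illustration only.
    """
    # re.split yields None and empty strings around the captured URL; drop them.
    parts = [p for p in _BACKEND_SPLIT.split(spec) if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected backend spec: {spec!r}")
    name = parts[0]
    tag = parts[1] if len(parts) > 1 else default_tag
    org = parts[2] if len(parts) > 2 else default_org
    return name, tag, org


if __name__ == "__main__":
    print(parse_backend_spec(
        "onnxruntime:r25.08_kjp:https://github.com/kpedro88", "r25.08"))
    print(parse_backend_spec("python", "r25.08"))
```

Capturing the URL as a single group keeps the ":" inside "https://" from being treated as a field separator, which a plain split on ":" would not do.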

Checklist

  • I have read the Contribution guidelines and signed the Contributor License
    Agreement
  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • I ran pre-commit locally (pre-commit install, pre-commit run --all)
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

N/A

Where should the reviewer start?

Changes only affect build.py and associated documentation.

Test plan:

These changes only affect the build script, so there is no impact on server functionality.

An example of using some of these options:

In branch r25.08 (without these changes):

./build.py --target-platform linux -j 24 -v --enable-all --dryrun
output:

Building Triton Inference Server
platform linux
machine x86_64
version 2.60.0dev
build dir /storage/local/data2/pedrok/sonic/server/build
install dir None
cmake dir None
default repo-tag: r25.07
container version 25.08dev
upstream container version 25.07
endpoint "http"
endpoint "grpc"
endpoint "sagemaker"
endpoint "vertex-ai"
filesystem "gcs"
filesystem "s3"
filesystem "azure_storage"
backend "ensemble" at tag/branch "r25.07"
backend "identity" at tag/branch "r25.07"
backend "square" at tag/branch "r25.07"
backend "repeat" at tag/branch "r25.07"
backend "onnxruntime" at tag/branch "r25.07"
backend "python" at tag/branch "r25.07"
backend "dali" at tag/branch "r25.07"
backend "pytorch" at tag/branch "r25.07"
backend "openvino" at tag/branch "r25.07"
backend "fil" at tag/branch "r25.07"
backend "tensorrt" at tag/branch "r25.07"
repoagent "checksum" at tag/branch "r25.07"
cache "local" at tag/branch "r25.07"
cache "redis" at tag/branch "r25.07"
component "common" at tag/branch "r25.07"
component "core" at tag/branch "r25.07"
component "backend" at tag/branch "r25.07"
component "thirdparty" at tag/branch "r25.07"

After merging this branch into r25.08, the following is possible:

./build.py --target-platform linux -j 24 -v --enable-all --backend onnxruntime:r25.08_kjp:https://github.com/kpedro88 --default-repo-tag r25.08 --use-buildbase --dryrun
output:

Building Triton Inference Server
platform linux
machine x86_64
version 2.60.0dev
build dir /storage/local/data2/pedrok/sonic/server/build
install dir None
cmake dir None
default repo-tag: r25.08
container version 25.08dev
upstream container version 25.07
endpoint "http"
endpoint "grpc"
endpoint "sagemaker"
endpoint "vertex-ai"
filesystem "gcs"
filesystem "s3"
filesystem "azure_storage"
backend "onnxruntime" at tag/branch "r25.08_kjp" from org "https://github.com/kpedro88"
backend "ensemble" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "identity" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "square" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "repeat" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "python" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "dali" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "pytorch" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "openvino" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "fil" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
backend "tensorrt" at tag/branch "r25.08" from org "https://github.com/triton-inference-server"
repoagent "checksum" at tag/branch "r25.08"
cache "local" at tag/branch "r25.08"
cache "redis" at tag/branch "r25.08"
component "common" at tag/branch "r25.08"
component "core" at tag/branch "r25.08"
component "backend" at tag/branch "r25.08"
component "thirdparty" at tag/branch "r25.08"
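
The difference between the two dry runs comes down to how the default repo tag is chosen. Below is a minimal sketch of that choice, inferred from the description and output above rather than copied from build.py. The function name, its parameters, and the fallback to "main" for dev versions are assumptions.

```python
def resolve_default_repo_tag(override, upstream_container_version, container_version):
    """Return the tag/branch used for repos that have no explicit tag.

    Assumptions (not confirmed against build.py): without an override, the
    upstream container version takes precedence over the container version,
    release versions map to 'r<version>', and dev versions fall back to 'main'.
    """
    # An explicit --default-repo-tag value wins over any calculated default.
    if override is not None:
        return override
    version = upstream_container_version or container_version
    if version.endswith("dev"):
        return "main"
    return "r" + version


if __name__ == "__main__":
    # Without the override, the upstream version (25.07) decides the default.
    print(resolve_default_repo_tag(None, "25.07", "25.08dev"))
    # With --default-repo-tag r25.08, the override is used for all repos.
    print(resolve_default_repo_tag("r25.08", "25.07", "25.08dev"))
```

Under these assumptions the first call yields "r25.07" and the second "r25.08", matching the "default repo-tag" lines in the two outputs above.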

Caveats:

The syntax for changing the backend org could be more elegant, and the feature is not extended to other components besides backends. I am planning a more thorough improvement to build.py that will address this in a followup PR, but it will be a more involved change and is still in progress.

Background

These changes were useful in building a 25.08dev server to incorporate #8335, which includes an important bug fix and is not yet in a released version.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances the build.py script with four new command-line flags to increase build flexibility, particularly for development workflows and external contributors working with forks.

  • Adds --no-container-cache flag to disable Docker cache during container builds
  • Adds --default-repo-tag flag to override calculated default repository tags globally
  • Adds --use-buildbase flag to use the temporary "buildbase" image for backend builds
  • Extends --backend syntax to support custom organizations/repositories
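
As a rough illustration of the first bullet, the flag could be wired to the Docker command along the following lines. This is a sketch, not the code in build.py: make_parser, docker_build_args, the image tag, and the Dockerfile name are hypothetical. Only the --no-container-cache flag name and Docker's --no-cache option come from the PR.

```python
import argparse


def make_parser():
    """Build a tiny parser with only the flag under discussion."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--no-container-cache",
        action="store_true",
        help="Do not use the Docker cache when building container images.",
    )
    return parser


def docker_build_args(image_tag, dockerfile, no_container_cache=False):
    """Assemble a 'docker build' command line as a list of arguments."""
    args = ["docker", "build"]
    if no_container_cache:
        # Propagate the build.py flag to docker's own --no-cache option.
        args.append("--no-cache")
    args += ["-t", image_tag, "-f", dockerfile, "."]
    return args


if __name__ == "__main__":
    flags = make_parser().parse_args(["--no-container-cache"])
    # Placeholder image and Dockerfile names, chosen for this example only.
    print(docker_build_args("example_image", "Dockerfile.example",
                            flags.no_container_cache))
```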

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
build.py | Implements the four new flags and their associated logic, including regex parsing for backend syntax and image mapping updates
docs/customization_guide/build.md | Documents the new --use-buildbase flag and extended backend syntax with organization support

Comment on lines +2903 to +2904
pattern = r"(https?:\/\/[^\s:]+)|:"
parts = list(filter(None,re.split(pattern, be)))

Copilot AI Sep 17, 2025

The regex pattern and splitting logic is complex and may be difficult to maintain. Consider extracting this parsing logic into a separate function with clear documentation about the expected input format and returned structure.

Contributor Author

Agreed. My followup branch is not quite finished yet, but involves a more thorough redesign that will avoid the need for any custom parsing.

for be in FLAGS.backend:
    parts = be.split(":")
    pattern = r"(https?:\/\/[^\s:]+)|:"
    parts = list(filter(None,re.split(pattern, be)))

Copilot AI Sep 17, 2025

Missing space after comma in function call. Should be filter(None, re.split(pattern, be)).

Suggested change
- parts = list(filter(None,re.split(pattern, be)))
+ parts = list(filter(None, re.split(pattern, be)))

@whoisj
Contributor

whoisj commented Sep 17, 2025

Thanks for the contribution! I know we've been distracted, but we're doing our best to catch back up.

This looks good to me.

@mc-nv is the expert here. I believe he's out of office until next week; hopefully, once he's back and the dust has settled, I can get his review here.

@whoisj added the enhancement (New feature or request) and build (Issues pertaining to builds) labels Sep 17, 2025