Skip to content

Conversation

@WBobby
Copy link
Contributor

@WBobby WBobby commented Dec 3, 2025

Motivation

Add a new tool to analyze Ninja build times and generate HTML reports, helping identify slow-building components in the ROCm build.

Issue Link: #2214

Technical Details

Changes:

  • Add analyze_build_times.py: parses .ninja_log and generates HTML report
  • Add report_build_time_template.html: HTML template for the report
  • Integrate into CI workflow to auto-generate reports after builds

Test Plan

Initiate Continuous Integration (CI) to validate the code.
Execute the test script in the local environment.

Test Result

https://therock-ci-artifacts.s3.amazonaws.com/19685707430-linux/logs/gfx94X-dcgpu/build_time_analysis.html
The script successfully generates the HTML file.

Submission Checklist

Add a new tool to analyze Ninja build times and generate HTML reports,
helping identify slow-building components in the ROCm build.

Changes:
- Add analyze_build_times.py: parses .ninja_log and generates HTML report
- Add report_build_time_template.html: HTML template for the report
- Integrate into CI workflow to auto-generate reports after builds
# =============================================================================


def get_system_info() -> Dict[str, str]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this to the report.

@amd-shiraz
Copy link

If possible we also want to add "Resource utilization" per component to the same report i n future. That will help with the work we are doing with @geomin12 on dockers dying due to lack of resources. for ASAN work.

@amd-shiraz
Copy link

@WBobby i had one more feedback pls if we can also incorporate. can we also add a total build time info at the end for components and dependencies ? thanks!

Calculate wall clock time from ninja log and display it in the
Build Information section alongside CPU, memory details.

Signed-off-by: Wang, Yanyao <[email protected]>
@kiran-thumma
Copy link
Contributor

Link the issue #2399 in PR a mandatory.

@WBobby WBobby requested a review from kiran-thumma December 6, 2025 00:53
Copy link
Contributor

@kiran-thumma kiran-thumma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank You for adding comments to use the script.

@WBobby WBobby merged commit 2e0faff into main Dec 8, 2025
232 of 241 checks passed
@WBobby WBobby deleted the users/yanywang/3-12-build-time-analysis branch December 8, 2025 22:21
@github-project-automation github-project-automation bot moved this from TODO to Done in TheRock Triage Dec 8, 2025
Copy link
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running ~1 week behind on reviews, sorry. Here's some post-merge feedback.

Comment on lines +158 to +162
# Analyze ninja build log to generate per-component timing report
- name: Analyze Build Times
if: ${{ !cancelled() }}
run: |
python3 build_tools/analyze_build_times.py --build-dir build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work on Windows too?

Please keep Linux and Windows workflows in sync: https://github.com/ROCm/TheRock/blob/main/.github/workflows/build_windows_artifacts.yml

Comment on lines +267 to +270
# Build Time Analysis is only generated on Linux
if PLATFORM == "linux":
analysis_url = f"{bucket_url}/logs/{artifact_group}/build_time_analysis.html"
gha_append_step_summary(f"[Build Time Analysis]({analysis_url})")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? We should have these for Windows too.

Also note that multiple workflows call this post_build_upload.py script, so this file may not exist. See this for example:

# TODO: upload logs even if the build fails (separate from artifact upload?)
- name: Post Build Upload
if: ${{ github.repository_owner == 'ROCm' }}
run: |
python3 build_tools/github_actions/post_build_upload.py \
--run-id ${{ github.run_id }} \
--artifact-group "${{ matrix.target_bundle.amdgpu_family }}" \
--build-dir ${{ env.OUTPUT_DIR }}/build \
--upload

This should check if the file was found and uploaded, not if the current platform is Linux

Comment on lines +40 to +67
# Build name -> Display name mapping
NAME_MAPPING = {
"clr": "core-hip",
"ocl-clr": "core-ocl",
"ROCR-Runtime": "core-runtime",
"blas": "rocBLAS",
"prim": "rocPRIM",
"fft": "rocFFT",
"rand": "rocRAND",
"miopen": "MIOpen",
"hipdnn": "hipDNN",
"composable-kernel": "composable_kernel",
"support": "mxDataGenerator",
"host-suite-sparse": "SuiteSparse",
"rocwmma": "rocWMMA",
"miopen-plugin": "miopen_plugin",
}

# Top-level directories for ROCm components
ROCM_COMPONENT_DIRS = {
"base",
"compiler",
"core",
"comm-libs",
"dctools",
"profiler",
"ml-libs",
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these mappings in this file? This may be difficult to maintain in the future as projects are moved around (https://github.com/ROCm/TheRock/blob/main/docs/rfcs/RFC0003-Build-Tree-Normalization.md)

Comment on lines +251 to +259
# Get CPU model from /proc/cpuinfo
try:
with open("/proc/cpuinfo", "r") as f:
for line in f:
if line.startswith("model name"):
info["cpu_model"] = line.split(":")[1].strip()
break
except (FileNotFoundError, IOError):
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely Linux-specific.

We have some code for this that is more portable here:

def device_cpu_status(self):
"""
**Warning:** This function may broken in Cluster systems.
Return CPU status, include its name, architecture, total cpu count.
-> `(CPU_NAME, CPU_ARCH, CPU_CORES)`
"""
import os, platform, subprocess, re
if self.is_windows:
_cpu_name = get_regedit(
"HKLM",
r"HARDWARE\DESCRIPTION\System\CentralProcessor\0",
"ProcessorNameString",
)
_cpu_arch = platform.machine()
_cpu_core = os.cpu_count()
return (_cpu_name, _cpu_core, _cpu_arch)
elif self.is_linux:
_cpu_name = (
re.search(
r"^\s*Model name:\s*(.+)$",
subprocess.run(
["lscpu"], capture_output=True, text=True, check=True
).stdout,
re.MULTILINE,
)
.group(1)
.strip()
)
_cpu_arch = platform.machine()
_cpu_core = os.cpu_count()
return (_cpu_name, _cpu_core, _cpu_arch)
else:
# <ADD BSD/Intel_MAC ???>
...
pass

Can we just run that env_check and pull that information into these analysis reports instead of duplicating code? Or we can pull those functions into a common module used for both scripts.

Comment on lines +320 to +330
def build_table_rows(
data: Dict[str, Dict[str, int]], phase_columns: List[str]
) -> List[tuple]:
"""Build sorted table rows from project phase data."""
rows = []
for name, phases in data.items():
total = sum(phases.values())
cols = [format_duration(phases.get(p, 0)) for p in phase_columns]
rows.append((name, cols, format_duration(total), total))
rows.sort(key=lambda x: x[3], reverse=True)
return [(r[0], r[1], r[2]) for r in rows]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to be able to sort https://therock-ci-artifacts.s3.amazonaws.com/19685707430-linux/logs/gfx94X-dcgpu/build_time_analysis.html based on project name too. Let's add "make the table dynamic" as a feature request for later

Comment on lines +320 to +330
def build_table_rows(
data: Dict[str, Dict[str, int]], phase_columns: List[str]
) -> List[tuple]:
"""Build sorted table rows from project phase data."""
rows = []
for name, phases in data.items():
total = sum(phases.values())
cols = [format_duration(phases.get(p, 0)) for p in phase_columns]
rows.append((name, cols, format_duration(total), total))
rows.sort(key=lambda x: x[3], reverse=True)
return [(r[0], r[1], r[2]) for r in rows]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More feature requests:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants