Releases: intel/PerfSpect
v3.8.0
What's Changed
Version 3.8.0 is a feature and maintenance release.
New Features, Changes, and Enhancements
- metrics command now supports Intel Granite Rapids processors on Google Cloud (C4 instances)
- metrics command's TMA metrics for Granite Rapids updated
- Network IRQs table format improved to avoid one long line of data by adding separators that will allow wrapping
- metrics command no longer errors and exits when the PMU is determined to be in use, warning is generated instead
- Intel Clearwater Forest now recognized and identified by report and config commands
- Intel Granite Rapids D now recognized and identified by report and config commands
- Intel Arrow Lake CPUs now recognized and identified by report command
- AWS Graviton 4, ARM Neoverse-V2 CPUs now recognized and identified by report command
Fixes
- power and temperature benchmarks in report command now work on additional architectures by fixing turbostat output parsing
- NIC table fixed in report command
- race condition in config command fixed when setting multiple configuration options at the same time
- memory benchmark in report command fixed when output format changed in newer MLC release
- frequency benchmark in report command fixed when number of cores per die differs per die
Full Changelog: v3.7.0...v3.8.0
v3.7.0
What's Changed
Version 3.7.0 is a feature and maintenance release.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
New Features and Enhancements
- the metrics HTML report now supports comparing two sets of metrics
- metrics command can optionally expose a Prometheus compatible metrics endpoint using --prometheus-server and --prometheus-server-addr
- flame command can now target multiple PIDs using --pids
- flame command can now control the depth of the call stack using --max-depth
- eliminated the requirement to have Perl installed on the target for the flame command
- config command can now enable/disable c6 and c1-demotion
- config command can now configure LLC size on SRF and GNR
- config command can now enable/disable LLC prefetcher on SRF
- telemetry command now reports CPU temperature, IPC and C6 residency
- report command now includes vendor and model ID in the NIC table
- logs can now be directed to stdout using --log-stdout; useful when combined with the metrics prometheus server feature
- metrics command "PMU in use" error and exit changed to a warning
Fixes
- address problems found with collecting metrics for cgroups
- fix memory benchmark chart X-axis label from MB/s to GB/s
- fix index out of range error in renderXlsxTableMultiTarget
- fix determination of availability of fixed counters
Full Changelog: v3.6.1...v3.7.0
v3.6.1
Version 3.6.1 fixes a bug found in 3.6.0 when parsing non-padded HEX values for CPU frequencies.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
Full Changelog: v3.6.0...v3.6.1
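The v3.6.1 fix concerns hexadecimal values that are printed without leading zeros. The release note does not describe how PerfSpect lays out or parses these values, so the following is only a minimal sketch under an assumed layout: a 64-bit value in which each byte holds one frequency ratio. The function name parseRatioBytes is hypothetical. The sketch shows why parsing the whole string as an integer, rather than slicing it into two-character pairs from the left, avoids misaligned bytes when the leading zero is missing.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRatioBytes is a hypothetical helper, not PerfSpect's implementation.
// It assumes hexValue is a 64-bit value in which each byte holds one
// frequency ratio, and returns the eight bytes from least to most significant.
// Parsing the whole string as an integer makes leading zeros irrelevant, so
// "f1f2020" and "0f1f2020" decode identically.
func parseRatioBytes(hexValue string) ([8]uint8, error) {
	var ratios [8]uint8
	trimmed := strings.TrimPrefix(strings.TrimSpace(hexValue), "0x")
	value, err := strconv.ParseUint(trimmed, 16, 64)
	if err != nil {
		return ratios, fmt.Errorf("parse %q as hex: %w", hexValue, err)
	}
	for i := range ratios {
		// Shift the byte of interest into the low 8 bits, then truncate.
		ratios[i] = uint8(value >> (8 * i))
	}
	return ratios, nil
}

func main() {
	ratios, err := parseRatioBytes("f1f2020") // no leading zero
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ratios[:4])
}
```

Splitting "f1f2020" into pairs from the left would yield f1, f2, 02 and a dangling 0, which is the kind of misalignment a non-padded value can cause.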
v3.6.0
Version 3.6.0 is a feature and maintenance release.
To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.
New Features & Enhancements
- The CPU frequency table from the report command now includes frequencies for SSE, AVX2, AVX512, and AMX, when supported by architecture
- Flamegraphs can now be limited to a specific process (PID)
- Prefetchers can be enabled/disabled with the config command
- A brief system configuration summary table has been added to the metrics, flame, lock, and telemetry reports
- Added preliminary support for the Intel Clearwater Forest CPU architecture
- The lock command can now retrieve a binary perf package that can be used for analysis off the target
- Added support for metrics, including per-transaction metrics, on EC2 m7a (AMD Genoa) and AMD Turin
Fixes
- The config command can now set the max core frequency on SRF and GNR
- The targets.yaml file no longer requires a value for the target name field
Breaking Changes
- Some flags for the config command have been renamed for consistency and readability. See perfspect config -h.
Full Changelog: v3.5.0...v3.6.0
v3.5.2
v3.5.2 is a bug-fix release (Note: v3.5.1 was a bad build/release and has since been deleted)
Two issues were found in 3.5.0 and are now fixed in 3.5.2.
- perfspect will exit with a panic when an incorrect command line argument is presented
- perfspect will exit with an error when falsely identifying the temp directory as being located on a file system mounted with 'noexec'
Full Changelog: v3.5.0...v3.5.2
v3.5.0
Version 3.5.0 is a feature and maintenance release with the following additions/fixes.
Breaking Change
- The --targettemp flag has been removed. Use the --tempdir flag to override the directory where collection scripts are executed.
New Features & Enhancements
- The --txnrate flag used with the metrics command now augments the metrics list with transaction-oriented metrics rather than replacing existing metrics.
- The --syslog flag redirects log output to the local syslog daemon. This is useful when running PerfSpect for long durations and/or running as a CRON job.
- Improved shutdown when PerfSpect receives SIGINT (ctrl-c).
- Added GNR prefetcher settings to report.
- Added clustering mode (SNC, UMA) for GNR and SRF to report.
- Added CPU frequency chart to telemetry report.
- Added table of network-related kernel parameters to report.
- Added TME (total memory encryption) on/off to report.
- Added TMA level 1 over time chart to the metrics HTML report.
- Added configured DIMM speed and DIMM rank to DIMM table.
Fixes
- Addressed incorrect measured CPU frequency chart on GNR when SNC is disabled.
- Addressed missing NIC information in report.
- Addressed error when /tmp is on a file system mounted with 'noexec' (use --tempdir to override).
- Addressed incorrect memory channels listed for SRF-AP.
Full Changelog: v3.4.0...v3.5.0
v3.4.0
Version 3.4.0 is a feature and maintenance release with the following additions/fixes.
New Features & Enhancements
- Gaudi device stats now included in the telemetry command report.
- Metrics command event data can now be re-processed so that a previously unknown transaction rate (--txnrate) can be applied.
- The telemetry command now accepts a duration value of zero (--duration 0) to run until interrupted by SIGINT (ctrl-c).
- The telemetry command HTML report now includes time stamps on the x-axis of charts.
- The config command now allows setting the compute and I/O die frequencies independently (SRF and GNR).
- The branch misprediction metric was added to the metrics report.
- The report command now includes the Speed Select Technology frequency table when it is enabled.
- Added insight entry to report command to warn when ELC is configured in latency-optimized mode and EPB is non-zero.
- The report and config commands now determine which EPB configuration value (OS or BIOS) is active and report and/or change the appropriate entry.
- Report command tables that are not relevant to a given CPU architecture are no longer included in the output.
Fixes
- L3 per core reported by the report command was inaccurate on some CPU architectures.
- On multi-socket systems where a socket has been disabled via BIOS, the microarchitecture may be reported incorrectly.
What's Changed
- enable post-processing of pre-collected metric events by @harp-intel in #192
- enable indefinite duration for telemetry collection by @harp-intel in #203
- show timestamps in metrics summary and telemetry charts by @harp-intel in #205
- refactor html report generation to reduce duplication by @harp-intel in #206
- add branch mispredict ratio metric by @harp-intel in #207
- use remote target's perf for metrics collection if it is installed and new enough by @harp-intel in #208
- Highlight notes, tips, and warnings in README by @harp-intel in #209
- Bump github.com/spf13/cobra from 1.8.1 to 1.9.1 by @dependabot in #210
- add speed select turbo frequency tables by @harp-intel in #211
- fix report for l3 size per core when l3 instances are used by multipl… by @harp-intel in #213
- Get and set compute and I/O die max/min frequencies independently by @harp-intel in #216
- add example output images to README by @harp-intel in #218
- refactor scripts to use templating by @harp-intel in #222
- use alternate EPB value when configured to do so by @harp-intel in #223
- report tables associated with CPU models by @harp-intel in #225
- fix GNR_X* microarchitecture detection by @harp-intel in #227
- collect and report Gaudi device telemetry by @harp-intel in #217
Full Changelog: v3.3.1...v3.4.0
v3.3.1
Maintenance/Bug Fix Release
What's Changed
- Bump golang.org/x/term from 0.28.0 to 0.29.0 by @dependabot in #193
- Bump golang.org/x/text from 0.21.0 to 0.22.0 by @dependabot in #194
- fix regression in metrics command causing error on some AWS m5 and m6 instance types by @harp-intel in #196
- address race condition in metrics event processing by @harp-intel in #198
- process last set of events consistently by @harp-intel in #199
- send sigkill to child processes when receive sigint by @harp-intel in #201
Full Changelog: v3.3.0...v3.3.1
v3.3.0
Features/Enhancements:
- add instruction mix reporting to telemetry command by @harp-intel in #179
- add storage performance benchmark to 'report' command by @harp-intel in #161
- add 2nd level TMA metrics to chart by @harp-intel in #165
- add metadata tab to metrics summary HTML report by @harp-intel in #171
Maintenance/Bug Fixes:
- accept metric list without including "metric_" prefix by @harp-intel in #155
- add support for customized "no data found" message for any table by @harp-intel in #156
- recognize L3 cache size in GiB by @harp-intel in #158
- add cpu def for Turin zen 5c by @harp-intel in #160
- add benchmark descriptions by @harp-intel in #163
- add pid to ssh control master file name so that it doesn't get reused… by @harp-intel in #169
- if instructions event cannot be collected then assume target is not s… by @harp-intel in #170
- remove parse error from report, add to log by @harp-intel in #175
- provide sudo password to script when needed by @harp-intel in #177
- change flame and lock help to show 'all' is default format option by @harp-intel in #184
- fix bug where higher granularity metrics are not properly printed by @harp-intel in #182
- don't use bc in script as it is not available, by default, on some Linux OS distributions by @harp-intel in #189
- Bump github.com/spf13/pflag from 1.0.5 to 1.0.6 by @dependabot in #190
- fix metrics socket and cpu granularity metric calculations by @harp-intel in #191
Full Changelog: v3.2.0...v3.3.0
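One v3.3.0 item above is "recognize L3 cache size in GiB". The release note does not show the input format or the code involved, so the following is only an illustrative sketch. It assumes the size arrives as text of the form "<number> <unit>" with binary units, and the function name cacheSizeMiB is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cacheSizeMiB is a hypothetical helper, not PerfSpect's implementation.
// It converts text such as "384 MiB" or "1.5 GiB" into a size in MiB.
// The "<number> <unit>" input format is an assumption made for illustration.
func cacheSizeMiB(text string) (float64, error) {
	fields := strings.Fields(text)
	if len(fields) < 2 {
		return 0, fmt.Errorf("expected \"<number> <unit>\", got %q", text)
	}
	size, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, fmt.Errorf("parse size in %q: %w", text, err)
	}
	switch fields[1] {
	case "KiB":
		return size / 1024, nil
	case "MiB":
		return size, nil
	case "GiB":
		return size * 1024, nil
	default:
		return 0, fmt.Errorf("unsupported unit %q", fields[1])
	}
}

func main() {
	for _, text := range []string{"384 MiB", "1.5 GiB"} {
		mib, err := cacheSizeMiB(text)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s = %.0f MiB\n", text, mib)
	}
}
```

Returning an error for an unknown unit, rather than guessing, keeps an unexpected format visible in the logs instead of silently producing a wrong size.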
v3.2.0
What's Changed
Features/Enhancements:
- support for GCP C4 instances by @harp-intel in #134
- add AMD Turin CPU identifier by @harp-intel in #127
- make sure PMUs are not in use when running the metrics command by @harp-intel in #144
- Enable interruption of PerfSpect with SIGINT (ctrl-c) when collecting data over SSH by @harp-intel in #145
- limit config flags to specific uarchs by @harp-intel in #149
Maintenance/Bug Fixes:
- consider pkg control value when getting and setting EPP by @harp-intel in #142
- clean up benchmark summary table when no min latency collected by @harp-intel in #129
- perfspect report flag 'all' is true by default by @harp-intel in #128
- assume events not supported on failure when loading metadata for metrics command by @harp-intel in #139
- Bump golang.org/x/term from 0.27.0 to 0.28.0 by @dependabot in #140
- fix metrics HTML title field by @harp-intel in #147
- filter out infinite values in metrics summary by @harp-intel in #152
- Update README.md by @HarpPDX in #136
Full Changelog: v3.1.0...v3.2.0