[Feature Request] Enhance Profile API to show (star-tree/other) pre-computation time #19295

@sandeshkr419

Is your feature request related to a problem? Please describe

Currently, when a query is optimized through a star-tree index or another pre-compute path, the time spent in that pre-computation phase is not explicitly visible in the Profile API output. This observability gap makes it difficult to debug performance or to quantify the benefit of the optimization.

Pre-compute phase: most aggregations currently support one or more pre-compute steps that skip iterating through the matching documents.

The profile output below is from a multi_terms aggregation that was successfully pre-computed by a star-tree index. It clearly demonstrates the current lack of visibility:

// ... (profile output for one shard)
"aggregations": [
    {
        "type": "MultiTermsAggregator",
        "description": "ip_and_status_combinations",
        "time_in_nanos": 15404082,
        "breakdown": {
            "build_leaf_collector": 14972083,
            "build_aggregation": 427750,
            "collect": 0,
            "post_collection": 958,
            "initialize": 3291
        }
    }
]

Observations from this output:

"collect": 0: The collect phase, which normally iterates through documents, took zero time. This correctly indicates that the standard collection path was bypassed by an optimization.

"build_leaf_collector": 14,972,083 nanos (~15ms): This phase accounts for ~97% of the total aggregation time. This is highly misleading. The work done was not "building a leaf collector"; it was the entire process of scanning the star-tree index and creating buckets from the pre-aggregated results. The metric's name does not reflect the actual work performed.

No mention of "Star-Tree / pre-computation": The profile gives no explicit confirmation that the star-tree index was used. A user can only infer it from the unusual timing distribution (collect: 0).
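The arithmetic behind these observations can be sketched as follows. This is only an illustration of the inference a user has to make today: the function names and the 0.9 threshold are assumptions for this sketch, not part of the Profile API, and the heuristic is a guess rather than a confirmation that a star-tree was used.

```python
# Breakdown values copied from the profile output above.
breakdown = {
    "build_leaf_collector": 14972083,
    "build_aggregation": 427750,
    "collect": 0,
    "post_collection": 958,
    "initialize": 3291,
}


def phase_shares(breakdown):
    """Return each phase's fraction of the summed breakdown time."""
    total = sum(breakdown.values())
    if total == 0:
        return {phase: 0.0 for phase in breakdown}
    return {phase: nanos / total for phase, nanos in breakdown.items()}


def looks_precomputed(breakdown, threshold=0.9):
    """Heuristic only (hypothetical): collect is zero and
    build_leaf_collector dominates the aggregation time."""
    shares = phase_shares(breakdown)
    return (
        breakdown.get("collect", 0) == 0
        and shares.get("build_leaf_collector", 0.0) >= threshold
    )


shares = phase_shares(breakdown)
for phase, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{phase}: {share:.1%}")
print("likely pre-computed:", looks_precomputed(breakdown))
```

The point of the request is that users should not need a heuristic like this; the profile output should state the pre-computation explicitly.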

Describe the solution you'd like

We need a dedicated profiling entity for pre-computation phases, with a specific breakdown for star-tree operations.
The ideal output would separate the pre-computation work from the standard aggregation lifecycle phases.

Something like this might be more informative:

"aggregations": [
    {
        "type": "MultiTermsAggregator",
        "description": "ip_and_status_combinations",
        "time_in_nanos": 15404082,
        "breakdown": {
            "pre_compute": 14972083,           // <-- NEW: Top-level phase for pre-computation work
            "build_aggregation": 427750,
            "build_leaf_collector": 3500,      // <-- Now shows a realistic, small setup time
            "collect": 0,                      // <-- Still 0, as expected
            "post_collection": 958,
            "initialize": 3291
        },
        "children": [
            {
                "type": "StarTree",            // <-- NEW: A child profiler identifying the pre-compute source
                "time_in_nanos": 14972083,
                "description": "Pre-computation using star-tree index",
                "breakdown": {
                    // (Optional but ideal) Further breakdown of the star-tree work itself
                    "scan_star_tree_segments": 12000000,
                    "build_buckets_from_star_tree": 2972083
                }
            }
        ]
    }
]
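As a rough sketch of how a client could consume the proposed shape, the snippet below reads the suggested `pre_compute` field and the `StarTree` child entry. Both exist only in this proposal, not in the current Profile API; `precompute_summary` and `PRE_COMPUTE_TYPES` are hypothetical names introduced for illustration.

```python
# Proposed (not yet implemented) profile shape, trimmed from the example above.
proposed = {
    "type": "MultiTermsAggregator",
    "description": "ip_and_status_combinations",
    "time_in_nanos": 15404082,
    "breakdown": {
        "pre_compute": 14972083,
        "build_aggregation": 427750,
        "build_leaf_collector": 3500,
        "collect": 0,
        "post_collection": 958,
        "initialize": 3291,
    },
    "children": [
        {
            "type": "StarTree",
            "time_in_nanos": 14972083,
            "breakdown": {
                "scan_star_tree_segments": 12000000,
                "build_buckets_from_star_tree": 2972083,
            },
        }
    ],
}

# Hypothetical set of child types that denote a pre-compute source.
PRE_COMPUTE_TYPES = {"StarTree"}


def precompute_summary(agg):
    """Summarize pre-computation for one aggregation profile entry,
    or return None when no pre-compute time was reported."""
    pre = agg.get("breakdown", {}).get("pre_compute", 0)
    if pre == 0:
        return None
    total = agg.get("time_in_nanos", 0)
    sources = [
        child for child in agg.get("children", [])
        if child.get("type") in PRE_COMPUTE_TYPES
    ]
    return {
        "pre_compute_nanos": pre,
        "share_of_total": pre / total if total else 0.0,
        "sources": [child["type"] for child in sources],
        # True when each source's own breakdown adds up to its reported time.
        "children_consistent": all(
            sum(child.get("breakdown", {}).values()) == child["time_in_nanos"]
            for child in sources
        ),
    }


summary = precompute_summary(proposed)
print(summary)
```

With an explicit field and a typed child entry, a consumer can report the pre-compute source directly instead of inferring it from `collect: 0`.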

Related component

No response

Describe alternatives you've considered

To start with, a less granular framework could be the first iteration, with more granular breakdowns added in later iterations.

Additional context

Gaps identified as part of #19017
