@2010YOUY01
Contributor

Which issue does this PR close?

Part of #18217

Rationale for this change

In FilterExec, selectivity is calculated as output_rows / input_rows.
This PR adds a metric for it. I think this metric provides important application-level insight and will be commonly used, so it is displayed at the summary analyze level.
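
For illustration, the calculation and the display format seen in the demo output can be sketched as below. This is a minimal standalone sketch, not the code in this PR: `format_selectivity` is a hypothetical helper name, and the `N/A` fallback for zero input rows is an assumption about how an empty input might be shown.

```rust
/// Hypothetical helper (not part of DataFusion): formats selectivity
/// as a percentage followed by the raw `output/input` counts.
fn format_selectivity(output_rows: usize, input_rows: usize) -> String {
    // Assumption: with no input rows the ratio is undefined, so show N/A
    // instead of dividing by zero.
    if input_rows == 0 {
        return format!("N/A ({output_rows}/{input_rows})");
    }
    let pct = output_rows as f64 / input_rows as f64 * 100.0;
    // One decimal place, mirroring the demo output.
    format!("{pct:.1}% ({output_rows}/{input_rows})")
}
```

With the numbers from the demo (10 output rows out of 101 input rows), this yields the same `9.9% (10/101)` text.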

Demo in datafusion-cli

> set datafusion.explain.analyze_level = summary;
0 row(s) fetched.
Elapsed 0.000 seconds.

> explain analyze select * from generate_series(100) as t1(v1) where v1 <10;
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                   |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=10, elapsed_compute=1.763µs, output_bytes=64.0 KB]                                                                          |
|                   |   CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=10, elapsed_compute=25.833µs, output_bytes=64.0 KB]                                                                |
|                   |     FilterExec: value@0 < 10, metrics=[output_rows=10, elapsed_compute=34.888µs, output_bytes=128.0 B, selectivity=9.9% (10/101)]                                                      |
|                   |       RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=1, metrics=[]                                                                                                |
|                   |         LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=100, batch_size=8192], metrics=[output_rows=101, elapsed_compute=33.167µs, output_bytes=64.0 KB] |
|                   |                                                                                                                                                                                        |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.004 seconds.

What changes are included in this PR?

  1. Add a new MetricValue variant for ratios.
  2. Track selectivity in FilterExec with MetricValue::Ratio.
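
A rough sketch of the idea behind a ratio metric whose counters are thread-safe and shared across clones is shown below. It assumes a pair of atomic counters behind `Arc`; the type name `RatioSketch` and its methods are hypothetical and are not the actual `RatioMetrics` API from this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a ratio metric: `part / total`.
/// Cloning shares the same underlying counters via `Arc`.
#[derive(Debug, Clone, Default)]
pub struct RatioSketch {
    part: Arc<AtomicUsize>,
    total: Arc<AtomicUsize>,
}

impl RatioSketch {
    /// Add to the numerator (e.g. rows that passed the filter).
    pub fn add_part(&self, n: usize) {
        self.part.fetch_add(n, Ordering::Relaxed);
    }

    /// Add to the denominator (e.g. rows fed into the filter).
    pub fn add_total(&self, n: usize) {
        self.total.fetch_add(n, Ordering::Relaxed);
    }

    pub fn part(&self) -> usize {
        self.part.load(Ordering::Relaxed)
    }

    pub fn total(&self) -> usize {
        self.total.load(Ordering::Relaxed)
    }

    /// Returns `None` when the denominator is zero.
    pub fn ratio(&self) -> Option<f64> {
        match self.total() {
            0 => None,
            total => Some(self.part() as f64 / total as f64),
        }
    }
}
```

In this sketch, a filter operator would call `add_total` with each input batch's row count and `add_part` with the number of rows that survive the predicate; any clone held by a metrics reporter would observe the same totals.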

Are these changes tested?

Yes, with unit tests.

Are there any user-facing changes?

No

@github-actions github-actions bot added core Core DataFusion crate physical-plan Changes to the physical-plan crate labels Oct 31, 2025
Contributor

@alamb alamb left a comment

This makes sense to me, though I have a suggestion to reduce the duplication between this and pruned metrics

///
/// The counters are thread-safe and shared across clones.
#[derive(Debug, Clone, Default)]
pub struct RatioMetrics {

this is basically the same as Pruned metrics except the display is different -- I wonder if we could consolidate the two somehow 🤔

Member

@xudong963 xudong963 left a comment

Thank you, love this one.
