perf: Reduce stat copying #5329
Greptile Overview
Summary
This PR implements performance optimizations that reduce unnecessary copying and memory allocation in the runtime statistics system. The changes focus on two main areas: passing `QueryID` by reference throughout the subscriber system, and optimizing the statistics data structures.
The core optimization changes the `Subscriber` trait interface to accept `&QueryID` references instead of owned `QueryID` values, eliminating the atomic reference-counting overhead of `Arc<str>` clones on every method call. This change cascades through all subscriber implementations (Debug, Dashboard, Python) and the context notification system.
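The difference between the two calling conventions can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: it assumes `QueryID` is an alias for `Arc<str>` (as the summary implies), and the trait methods and `CountingSubscriber` are hypothetical names chosen for the example. Each method returns the strong count it observes, which makes the extra clone visible.

```rust
use std::sync::Arc;

// Assumption: QueryID is modeled as an alias for Arc<str>.
type QueryID = Arc<str>;

// Hypothetical, simplified trait; the real Subscriber trait has more methods.
trait Subscriber {
    // "Before" shape: takes ownership, so each caller must clone the Arc.
    fn on_exec_start_owned(&self, query_id: QueryID) -> usize;
    // "After" shape: borrows the Arc, so the reference count is untouched.
    fn on_exec_start(&self, query_id: &QueryID) -> usize;
}

struct CountingSubscriber;

impl Subscriber for CountingSubscriber {
    fn on_exec_start_owned(&self, query_id: QueryID) -> usize {
        // The clone made by the caller is alive for the duration of this call.
        Arc::strong_count(&query_id)
    }

    fn on_exec_start(&self, query_id: &QueryID) -> usize {
        Arc::strong_count(query_id)
    }
}

fn main() {
    let query_id: QueryID = Arc::from("query-1");
    let sub = CountingSubscriber;

    // Owned call: one atomic increment on clone, one atomic decrement on drop.
    let owned = sub.on_exec_start_owned(query_id.clone());
    // Borrowed call: no atomic operations on the reference count.
    let borrowed = sub.on_exec_start(&query_id);

    println!("owned call saw {} refs, borrowed call saw {}", owned, borrowed);
}
```

With several subscribers registered, the owned form pays the increment and decrement once per subscriber per notification, which is the overhead the by-reference signature avoids.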
The second major change replaces `StatSnapshotView` with `StatSnapshotSend`/`StatSnapshotRecv` types that use `SmallVec` for inline storage of the typical 3 statistics (CPU time, rows in, rows emitted), avoiding heap allocations in the common case. The `Stat` enum also gains the `Copy` trait since it contains only primitive types, further reducing cloning overhead.
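To make the inline-storage idea concrete without pulling in the `smallvec` crate, the sketch below uses a fixed three-slot array as a std-only stand-in. It is not the PR's `StatSnapshotSend` type: the `Stat` variants, the `InlineSnapshot` struct, and the stat names are all assumptions for illustration. Unlike `SmallVec`, this stand-in rejects a fourth entry instead of spilling to the heap.

```rust
use std::time::Duration;

// Hypothetical variants; the summary only says Stat holds primitive types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stat {
    Count(u64),
    Duration(Duration),
}

// Std-only stand-in for a SmallVec with inline capacity 3.
// All entries live inside the struct itself, so no heap allocation occurs.
#[derive(Debug, Clone, Copy)]
struct InlineSnapshot {
    len: usize,
    items: [(&'static str, Stat); 3],
}

impl InlineSnapshot {
    fn new() -> Self {
        // Array-repeat syntax works because the element type is Copy.
        Self { len: 0, items: [("", Stat::Count(0)); 3] }
    }

    // Returns false when the inline buffer is full (SmallVec would spill instead).
    fn push(&mut self, name: &'static str, stat: Stat) -> bool {
        if self.len == self.items.len() {
            return false;
        }
        self.items[self.len] = (name, stat);
        self.len += 1;
        true
    }

    fn iter(&self) -> std::slice::Iter<'_, (&'static str, Stat)> {
        self.items[..self.len].iter()
    }
}

fn main() {
    let mut snap = InlineSnapshot::new();
    snap.push("cpu_us", Stat::Duration(Duration::from_micros(1500)));
    snap.push("rows_in", Stat::Count(1000));
    snap.push("rows_emitted", Stat::Count(800));

    // Because Stat is Copy, the whole snapshot is a plain bitwise copy.
    let copy = snap;
    for (name, stat) in snap.iter() {
        println!("{}: {:?}", name, stat);
    }
    assert_eq!(copy.iter().count(), 3);
}
```

The design trade-off is the usual one for small-buffer containers: the struct is larger on the stack, but the common three-stat case never touches the allocator.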
The dashboard engine is simplified by removing intermediate serialization structures and directly using the optimized statistics types. The Python FFI layer also benefits from changing string parameters from owned `String` to borrowed `&str` references.
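The effect of switching a parameter from `String` to `&str` can be shown in plain Rust, leaving the actual Python bindings aside. The two functions below are hypothetical and exist only to report the address of the string data they receive: the borrowed version sees the caller's buffer, while the owned version sees a freshly allocated copy.

```rust
// Hypothetical "before" shape: the caller must give up or clone its String.
fn ptr_seen_owned(query_id: String) -> usize {
    query_id.as_ptr() as usize
}

// Hypothetical "after" shape: the callee reads the caller's buffer in place.
fn ptr_seen_borrowed(query_id: &str) -> usize {
    query_id.as_ptr() as usize
}

fn main() {
    let query_id = String::from("query-1");
    let original = query_id.as_ptr() as usize;

    // Borrowing: same buffer, no allocation.
    assert_eq!(ptr_seen_borrowed(&query_id), original);

    // Passing an owned clone: a second buffer exists while the call runs.
    assert_ne!(ptr_seen_owned(query_id.clone()), original);

    println!("borrowed call reused the caller's buffer");
}
```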
These optimizations target the telemetry and monitoring system which is called frequently during query execution, making the changes particularly beneficial for workloads with active dashboard monitoring or multiple subscribers.
Changed Files
Filename | Score | Overview |
---|---|---|
src/daft-local-execution/src/runtime_stats/subscribers/query.rs | 5/5 | Optimized query stats subscriber to pass query_id by reference and eliminate intermediate vector creation |
src/daft-context/src/subscribers/debug.rs | 5/5 | Updated DebugSubscriber to use QueryID references and StatSnapshotSend type |
src/daft-dashboard/src/state.rs | 4/5 | Replaced HashMap with StatSnapshotRecv for more efficient operator stats storage |
src/daft-dashboard/src/engine.rs | 4/5 | Refactored dashboard engine to use StatSnapshotRecv and eliminate intermediate serialization structures |
src/common/metrics/src/lib.rs | 4/5 | Added Copy trait to Stat enum and removed StatSnapshotView struct |
src/daft-context/src/subscribers/mod.rs | 4/5 | Changed Subscriber trait to use QueryID references and StatSnapshotSend type |
src/daft-context/src/subscribers/dashboard.rs | 5/5 | Optimized dashboard subscriber to use QueryID references and simplified stats serialization |
src/daft-context/src/python.rs | 5/5 | Changed Python FFI methods to use string references instead of owned strings |
src/daft-context/src/lib.rs | 5/5 | Updated context notification methods to pass QueryID by reference to subscribers |
src/daft-context/src/subscribers/python.rs | 5/5 | Updated Python subscriber to use QueryID references and iter() instead of into_iter() for stats |
Confidence score: 4/5
- This PR is safe to merge with low risk as it focuses on performance optimizations without changing core functionality
- Score reflects well-structured optimizations that maintain API compatibility while reducing memory overhead
- Pay close attention to src/common/metrics/src/lib.rs for the trait changes and type removals that affect multiple components
Sequence Diagram
sequenceDiagram
participant User
participant DaftContext
participant Subscriber
participant Dashboard
participant QueryExecution
participant RuntimeStats
User->>DaftContext: "Start Query"
DaftContext->>Subscriber: "notify_query_start(query_id, plan)"
Subscriber->>Dashboard: "POST /query/{id}/start"
Dashboard-->>Subscriber: "200 OK"
DaftContext->>Subscriber: "notify_optimization_start(query_id)"
Subscriber->>Dashboard: "POST /query/{id}/plan_start"
Dashboard-->>Subscriber: "200 OK"
DaftContext->>Subscriber: "notify_optimization_end(query_id, optimized_plan)"
Subscriber->>Dashboard: "POST /query/{id}/plan_end"
Dashboard-->>Subscriber: "200 OK"
DaftContext->>Subscriber: "on_exec_start(query_id, node_infos)"
Subscriber->>Dashboard: "POST /query/{id}/exec/start"
Dashboard-->>Subscriber: "200 OK"
QueryExecution->>RuntimeStats: "initialize_node(node_id)"
RuntimeStats->>Subscriber: "on_exec_operator_start(query_id, node_id)"
Subscriber->>Dashboard: "POST /query/{id}/exec/{op_id}/start"
Dashboard-->>Subscriber: "200 OK"
QueryExecution->>RuntimeStats: "emit stats"
RuntimeStats->>Subscriber: "on_exec_emit_stats(query_id, stats)"
Subscriber->>Dashboard: "POST /query/{id}/exec/emit_stats"
Dashboard-->>Subscriber: "200 OK"
QueryExecution->>RuntimeStats: "finalize_node(node_id)"
RuntimeStats->>Subscriber: "on_exec_operator_end(query_id, node_id)"
Subscriber->>Dashboard: "POST /query/{id}/exec/{op_id}/end"
Dashboard-->>Subscriber: "200 OK"
DaftContext->>Subscriber: "on_exec_end(query_id)"
Subscriber->>Dashboard: "POST /query/{id}/exec/end"
Dashboard-->>Subscriber: "200 OK"
DaftContext->>Subscriber: "notify_result_out(query_id, result)"
Note over Subscriber: "Collects preview rows"
DaftContext->>Subscriber: "notify_query_end(query_id)"
Subscriber->>Dashboard: "POST /query/{id}/end"
Dashboard-->>Subscriber: "200 OK"
Dashboard-->>User: "Query Complete"
Additional Comments (2)
10 files reviewed, 3 comments
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5329      +/-   ##
==========================================
- Coverage   75.37%   75.22%   -0.16%
==========================================
  Files         983      986       +3
  Lines      123738   123902     +164
==========================================
- Hits        93270    93202      -68
- Misses      30468    30700     +232
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Changes Made
Related Issues
Checklist
docs/mkdocs.yml
navigation