Extract SparkPlan product and append to trace #9783
base: master
Conversation
…ort more types and use JSON arrays
🎯 Code Coverage | Commit SHA: ef18062
Benchmarks
Startup
Summary: Found 1 performance improvement and 1 performance regression! Performance is the same for 51 metrics, 12 unstable metrics.
Startup time reports for petclinic
[Gantt chart: petclinic - global startup overhead: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
[Gantt chart: petclinic - break down per module: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Startup time reports for insecure-bank
[Gantt chart: insecure-bank - global startup overhead: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
[Gantt chart: insecure-bank - break down per module: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Load
Summary: Found 4 performance improvements and 5 performance regressions! Performance is the same for 3 metrics, 12 unstable metrics.
Request duration reports for insecure-bank
[Gantt chart: insecure-bank - request duration [CI 0.99]: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Request duration reports for petclinic
[Gantt chart: petclinic - request duration [CI 0.99]: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Dacapo
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metric.
Execution time reports for tomcat
[Gantt chart: tomcat - execution time [CI 0.99]: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Execution time reports for biojava
[Gantt chart: biojava - execution time [CI 0.99]: candidate=1.55.0-SNAPSHOT~ef18062877, baseline=1.55.0-SNAPSHOT~b733cdaf4e]
Force-pushed dc41615 to d9d6213
// Should really only return valid JSON types (Array, Map, String, Boolean, Number, null)
public Object parsePlanProduct(Object value) {
I don't love that this method returns an Object instead of something definite like a JSON node (or even just a String). The end goal is to allow any JSON object (other than null, which we filter out) to be serialized into a string using writeObjectToString, and this seemed like the most straightforward way to achieve that. There's probably some more idiomatic way I'm missing - happy to hear about it if anyone has ideas!
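For illustration, a minimal sketch of the shape being described (assumed types and handling, not the PR's exact code): values are normalized to JSON-friendly types, and null signals "drop this value".

public Object parsePlanProduct(Object value) {
  if (value instanceof String || value instanceof Boolean || value instanceof Number) {
    return value; // already a valid JSON scalar
  }
  if (value instanceof java.util.Collection) {
    java.util.List<Object> parsed = new java.util.ArrayList<>();
    for (Object element : (java.util.Collection<?>) value) {
      Object parsedElement = parsePlanProduct(element);
      if (parsedElement != null) {
        parsed.add(parsedElement); // written downstream as a JSON array
      }
    }
    return parsed;
  }
  return null; // unknown types are dropped rather than stringified
}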
Force-pushed d9d6213 to 54ab1ad
Force-pushed 54ab1ad to 0279fff
public static void exit(
    @Advice.Return(readOnly = false) SparkPlanInfo planInfo,
    @Advice.Argument(0) SparkPlan plan) {
  if (planInfo.metadata().size() == 0) {
By using the existing metadata on the DataSourceScanExec nodes, we open ourselves to a bit of inconsistency in the JSON parsing:
"meta": {
"Format": "Parquet",
"Batched": true,
...,
"DataFilters": "[CASE WHEN PULocationID#28 IN (236,132,161) THEN true ELSE isnotnull(PULocationID#28) END]"
},
Specifically the lists are not quoted & escaped, which means when we read out the field it's treated as a string rather than a JSON native array. Ideally we would parse this ourselves and upsert it so we can control that formatting, but obviously there's a risk of the parsing going wrong and impacting something that actually uses the field. Leaning slightly towards keeping the formatting as-is in favour of not touching existing fields but happy to hear any other thoughts on this...
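To make the inconsistency concrete, here is a small standalone example (using Jackson, which Spark already depends on; the strings are illustrative):

import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Collections;

public class MetaFormatDemo {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    String filters =
        "[CASE WHEN PULocationID#28 IN (236,132,161) THEN true ELSE isnotnull(PULocationID#28) END]";
    // Keeping Spark's metadata as-is: the whole list is emitted as one escaped JSON string
    System.out.println(mapper.writeValueAsString(filters));
    // Upserting a parsed form would emit a native JSON array instead, but requires
    // splitting the expression list ourselves, which is where parsing could go wrong
    System.out.println(mapper.writeValueAsString(Collections.singletonList(
        "CASE WHEN PULocationID#28 IN (236,132,161) THEN true ELSE isnotnull(PULocationID#28) END")));
  }
}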
// An extension of how Spark translates `SparkPlan`s to `SparkPlanInfo`, see here:
// https://github.com/apache/spark/blob/v3.5.0/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala#L54
public class Spark213PlanUtils extends AbstractSparkPlanUtils {
  public Map<String, String> extractPlanProduct(TreeNode plan) {
In the OpenLineage connector we had a special facet for storing the serialized LogicalPlan of the query. This was the most problematic feature we ever had, because the plan can contain anything. For example, if a user creates a few-gigabyte in-memory dataframe, that dataframe becomes a node in the logical plan, and the OpenLineage connector would try to serialize it and fail the whole Spark driver.
This PR seems to be doing the same thing for the physical plan. I think we shouldn't serialize the object when we don't know what's inside.
Chatted about this over a call, summarizing for posterity:
- Worth clarifying that this function does not traverse the tree itself; we leave that up to Spark, because we instrument the recursive `fromSparkPlan` method
- We should avoid arbitrarily serializing anything we don't know about, especially via `toString()`. Since we are taking the full product of the `TreeNode` we could get some enormous structure (improbable, but e.g. an array of all the data) and `toString()` would then attempt to serialize all of it
- Instead we should lean solely on `simpleString()`, which is safe by default, and default to not serializing otherwise. We could then serialize only other `TreeNode`s and leave out any unknown or unexpected data structures (a sketch of this guard follows below)
- With this change it would even be safe to parse the child `QueryPlan` nodes, because the output would no longer be the long physical plan but the one-line string
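A minimal sketch of that guard, assuming the Spark 2.x signature of simpleString (Spark 3+ adds a maxFields int parameter, which this PR handles via reflection):

import org.apache.spark.sql.catalyst.trees.TreeNode;

// Only TreeNodes are rendered, via their one-line simpleString(); anything
// unrecognized is dropped instead of being toString()-ed
static String safeRender(Object value) {
  if (value instanceof TreeNode) {
    // simpleString() prints a single summary line, never the full subtree
    return ((TreeNode<?>) value).simpleString();
  }
  return null; // unknown structures are not serialized at all
}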
// Parse any nested objects to a standard form that ignores the keys
// Right now we do this by just asserting the key set and none of the values
static Object parseNestedMetaObject(Object value) {
  if (value instanceof Map) {
    return value.keySet()
  } else {
    return value
  }
}
This was driving me nuts - there must be a better way to accomplish this without a ton of additional code... The issue is that for the Spark32 suite of tests, the expectations for the meta fields use named keys, but when we run the tests using Scala 2.12 we expect those to all show up as _dd.unknown_key.*. I added a (not great) way around that in assertSQLPlanEquals, which worked fine until we started getting nested maps that can have unknown keys. e.g.:
"meta": {
"_dd.unparsed" : "any",
"outputPartitioning" : {
"HashPartitioning" : {
"numPartitions" : 2,
"expressions" : [ "string_col#28" ]
}
},
"shuffleOrigin" : "ENSURE_REQUIREMENTS"
},
Where the numPartitions and expressions keys would show up as _dd.unknown_key.* in Scala 2.12. Initially I went for a recursive approach but that ended up feeling very bloated, so I abandoned it in favour of a subpar keyset check (i.e. only check that HashPartitioning exists in the map).
No false impressions that this is any good - let me know if there's a better way I'm missing, if just the key check is okay (only applies to the test suite running Scala 2.12/Spark 3.2.0, the other two suites compare everything as expected), or if we just have to put up with the recursive approach...
Changed the approach to compare lists of values instead of whatever I had put before - a bit cleaner and simpler to follow. Has its own downsides (e.g. not perfect comparisons as some stable keys are eliminated, and the containsAll comparison can be fooled) but at least it attempts to compare values and is much easier to maintain. Given it's on an older version of Scala that will no longer be supported for new Spark versions, I think this should probably be fine.
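A rough illustration of that comparison (helper names are invented; the actual test code differs):

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Flatten both meta maps down to their leaf values, ignoring keys entirely (Scala
// 2.12 reports them as _dd.unknown_key.*), then require containment, not equality
static void collectValues(Object node, List<Object> out) {
  if (node instanceof Map) {
    for (Object value : ((Map<?, ?>) node).values()) {
      collectValues(value, out);
    }
  } else if (node instanceof Collection) {
    for (Object value : (Collection<?>) node) {
      collectValues(value, out);
    }
  } else {
    out.add(node);
  }
}

static boolean metaValuesMatch(Map<?, ?> expected, Map<?, ?> actual) {
  List<Object> expectedValues = new ArrayList<>();
  List<Object> actualValues = new ArrayList<>();
  collectValues(expected, expectedValues);
  collectValues(actual, actualValues);
  // containsAll can be fooled by duplicate values, as noted above, but is easy to maintain
  return actualValues.containsAll(expectedValues);
}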
Went through the first round of reading and left some comments.
Please let me know what you think about it.
Overall this looks good to me. My primary concern is naturally to make sure this won't cause problems on any Spark version nor on any physical plans the job is processing.
I think the PR does well to achieve this:
- the feature is going to be rolled out first to users who explicitly turn it on,
- it serializes only known nodes (serializing unknown nodes is a common pitfall),
- the serializer is limited in recursion depth and max collection sizes (see the sketch after this comment),
- the code introduced depends in a minimal way on Spark classes and methods, making it resilient to future updates on the Spark side.
A few minor comments added. Happy to approve the PR once they're resolved.
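A minimal sketch of what such limits can look like; the constant names and values here are assumptions, not the PR's actual ones:

private static final int MAX_PARSING_DEPTH = 10; // assumed value
private static final int MAX_COLLECTION_SIZE = 100; // assumed value

// Checked before descending into a value: recursion stays bounded regardless of
// plan shape, and oversized collections are skipped wholesale
static boolean withinLimits(Object value, int depth) {
  if (depth > MAX_PARSING_DEPTH) {
    return false;
  }
  if (value instanceof java.util.Collection
      && ((java.util.Collection<?>) value).size() > MAX_COLLECTION_SIZE) {
    return false;
  }
  return true;
}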
assert res.toString() == "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]"
}

def "unknown objects should return null"() {
Thanks for creating a test for this. I think this is really important.
// in Spark v3+, the signature of `simpleString` includes an int parameter for `maxFields`
return TreeNode.class
    .getDeclaredMethod("simpleString", new Class[] {int.class})
    .invoke(value, MAX_LENGTH)
Please make sure this doesn't throw a NullPointerException in case getDeclaredMethod returns null.
Yes, I think that's true! Based on the signature of getDeclaredMethod it looks like we should expect NoSuchMethodException in that case:
public Method getDeclaredMethod(String name, Class<?>... parameterTypes) throws NoSuchMethodException, SecurityException
I've added NullPointerException to the catch just in case, though. 5527ad0
Just kidding, the spotbugs job did not like that - reverted that change. I'm fairly confident based on the signature & impl that we should only get NoSuchMethodException, though, and not NullPointerException. Let me know if we'd still like to do a more explicit null check.
If invoke returns null, our code will call toString on null, causing a NullPointerException.
Let me know if this is possible or never going to happen.
Ah, understood - you're right, I was looking at the wrong call. Updated to be an explicit cast. de336b9 (#9783)
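For context, a sketch of the resolved call; the surrounding class and the MAX_LENGTH value are assumptions here:

private static final int MAX_LENGTH = 100; // stands in for the PR's maxFields constant

static String safeSimpleString(Object value) {
  try {
    // Spark 3+: simpleString(int maxFields); the explicit (String) cast means a null
    // result from invoke() surfaces as a null String rather than an NPE from .toString()
    return (String)
        TreeNode.class.getDeclaredMethod("simpleString", int.class).invoke(value, MAX_LENGTH);
  } catch (ReflectiveOperationException e) {
    // NoSuchMethodException lands here on Spark 2.x, where simpleString() takes no
    // parameters; a fuller implementation would fall back to that overload
    return null;
  }
}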
This reverts commit 5527ad0.
Left first comments. I will continue the review later and let @mhlidd do the full review 😉
args.$plus$plus(
    JavaConverters.mapAsScalaMap(planUtils.extractFormattedProduct(plan))),
❔ question: Quick question: is this for creating a copy of the map?
To be frank, I was struggling with this a lot - I was trying to convert scala.collection.mutable.Map to scala.collection.immutable.Map, but I didn't quite know how to do that with all of the Scala implicits. Updated now to use toMap instead (figured out how to get the <:< implicit sorted properly). Let me know if this looks better!
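For reference, one way this conversion can be written from Java against Scala 2.12 (a hedged sketch: explicit type arguments may be needed, and the PR's final call may differ):

// Predef.$conforms() supplies the `<:<` evidence toMap needs in order to treat
// the map entries as (K, V) tuples
scala.collection.immutable.Map<String, String> immutableArgs =
    scala.collection.JavaConverters.mapAsScalaMap(planUtils.extractFormattedProduct(plan))
        .toMap(scala.Predef$.MODULE$.$conforms());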
public class Spark212PlanSerializer extends AbstractSparkPlanSerializer {
  @Override
  public String getKey(int idx, TreeNode node) {
    return String.format("_dd.unknown_key.%d", idx);
❔ question: Would String format be less expensive?
I assume you meant concatenation here 😄 updated!
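For reference, the concatenated form (inferred from the exchange above; the actual diff may differ slightly):

@Override
public String getKey(int idx, TreeNode node) {
  // javac compiles this into a StringBuilder append chain; no format-string parsing at runtime
  return "_dd.unknown_key." + idx;
}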
private final String SPARK_PKG_NAME = "org.apache.spark";

private final Set<String> SAFE_PARSE_TRAVERSE =
    new HashSet<>(Arrays.asList(SPARK_PKG_NAME + ".sql.catalyst.plans.physical.Partitioning"));
🎯 suggestion: Could we use Collections.singleton() rather than Arrays.asList() to avoid array allocation if we only need a Collection<String>?
Updated! Out of curiosity, are there better ways to declare the other multi-object sets as well, or is a HashSet as good as it gets?
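For reference, the common options (Set.of requires Java 9+, which may not match the tracer's minimum supported runtime):

import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SetExamples {
  // Single element: no array allocation at all
  static final Set<String> ONE =
      Collections.singleton("org.apache.spark.sql.catalyst.plans.physical.Partitioning");

  // Multi-element on Java 8: a HashSet is about as good as it gets, optionally
  // wrapped so accidental mutation fails at runtime
  static final Set<String> MANY_JAVA8 =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "b", "c")));

  // Multi-element on Java 9+: Set.of builds a compact immutable set
  static final Set<String> MANY_JAVA9 = Set.of("a", "b", "c");
}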
planInfo.simpleString(),
planInfo.children(),
HashMap.from(
    JavaConverters.asScala(planUtils.extractFormattedProduct(plan)).toList()),
Suggested change:
- JavaConverters.asScala(planUtils.extractFormattedProduct(plan)).toList()),
+ JavaConverters.asScala(planUtils.extractFormattedProduct(plan))),
Do we need to convert to a List first before converting to a HashMap?
Good point, updated
args.$plus$plus(
    JavaConverters.mapAsScalaMap(planUtils.extractFormattedProduct(plan))),
Do we need args here? It seems like it would always be an empty map, so there isn't a need to concatenate.
Same comment from above:
To be frank, I was struggling with this a lot - I was trying to convert `scala.collection.mutable.Map` to `scala.collection.immutable.Map`, but I didn't quite know how to do that with all of the Scala implicits. Updated now to use `toMap` instead (figured out how to get the `<:<` implicit sorted properly). Let me know if this looks better!
protected static def assertStringSQLPlanSubset(String expectedString, String actualString) {
  System.err.println("Checking if expected $expectedString SQL plan is a super set of $actualString")

protected static def assertStringSQLPlanSubset(String expectedString, String actualString, String name) {
Is it possible to create a Util class that stores all Spark assertions? That way the test classes can be separated from the assertion definitions they use.
I think I know what you mean, but would you have an example of a util class that's similar to what you're looking for? My assumption is this would be useful if we decide to swap out the assertion framework used?
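For the sake of discussion, a hypothetical shape such a util class could take (names invented for illustration):

// Test specs would call these statics instead of defining assertion helpers inline,
// decoupling the Spark test classes from the assertion definitions
final class SparkPlanAssertions {
  private SparkPlanAssertions() {}

  // Exact comparison, for suites where meta keys are stable (Scala 2.13)
  static void assertSqlPlanEquals(String expected, String actual) {
    if (!expected.equals(actual)) {
      throw new AssertionError("expected " + expected + " but got " + actual);
    }
  }

  // Containment check, for the Scala 2.12 / Spark 3.2.0 suite where keys surface
  // as _dd.unknown_key.*
  static void assertSqlPlanSubset(String expected, String actual) {
    if (!actual.contains(expected)) {
      throw new AssertionError(expected + " is not contained in " + actual);
    }
  }
}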
public static final String DATA_JOBS_PARSE_SPARK_PLAN_ENABLED =
    "data.jobs.parse_spark_plan.enabled";
public static final String DATA_JOBS_EXPERIMENTAL_FEATURES_ENABLED =
    "data.jobs.experimental_features.enabled";
Can these be added to metadata/supported-configurations.json and documented in the Feature Parity Dashboard? I added some docs about this recently that can be referenced.
Added (link), thanks for mentioning. I forgot about the env vars - would it be correct to assume that the env var name (e.g. DD_DATA_JOBS_PARSE_SPARK_PLAN_ENABLED) is inferred by the tracer when mapping it to the actual config in Config.java? Just curious since we don't explicitly define the env var keys anywhere else
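For what it's worth, the conventional property-to-env-var mapping follows this rule (a sketch of the convention, not the tracer's actual implementation):

// "data.jobs.parse_spark_plan.enabled" -> "DD_DATA_JOBS_PARSE_SPARK_PLAN_ENABLED"
static String toEnvVar(String propertyName) {
  return "DD_" + propertyName.replace('.', '_').replace('-', '_').toUpperCase(java.util.Locale.ROOT);
}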
…ormat, add FF to supported-configurations.json properly
What Does This Do
- Updates the `fromSparkPlan` function to:
  - parse the `plan` parameter into a map of String properties
  - populate the `meta` field of the returned `SparkPlanInfo` with those properties
- Adds a `Spark21XPlanUtils` class with an `extractPlanProduct` method that parses a `SparkPlan` object and returns the properties as a <String, String> map
- Adds an `AbstractSparkPlanUtils` class with a `parsePlanProduct` method, used by `Spark21XPlanUtils`, that parses the various Objects extracted by `extractPlanProduct` to return a comprehensible string representation
- Updates the `toJson` function in `SparkSQLUtils` to write a JSON object if possible, otherwise just write a string
- Adds two feature flags:
  - `dd.data.jobs.experimental_features.enabled`: meant to gate all experimental features before we GA; we should leave this on by default for all internal users
  - `dd.data.jobs.parse_spark_plan.enabled`: meant to gate this feature specifically
Motivation
The SparkPlan houses additional details about its execution that are useful for operators to visualize. Extract these into spans so they can be ingested.
Additional Notes
This PR leverages the existing `meta` field in the `SparkPlanInfo` class. This should be safe, as we don't overwrite the field if any data exists, and it is currently only used for `ScanExec` node details. Furthermore, since this class appears to be primarily intended as an abstraction for informational purposes, any faulty updates to the object shouldn't result in any breaking issues.
Also note that we use the `Product` API to obtain the key names (using `productElementName`); however, this was only made available in Scala 2.13. As a result the Scala 2.12 instrumentation uses arbitrary `_dd.unknown_key.X` names for the keys, so the values can at least be extracted.
Worth mentioning that this PR does not introduce traversal of the physical plan itself into the tracer - this is left to Spark itself. This is because the recursive `fromSparkPlan` method is instrumented, meaning as each node is built the tracer is invoked to parse it, and we expressly filter out any potential `QueryPlan` nodes when performing the parsing.
Contributor Checklist
- Add the `type:` and (`comp:` or `inst:`) labels in addition to any useful labels
- Avoid `close`, `fix` or any linking keywords when referencing an issue. Use `solves` instead, and assign the PR milestone to the issue

Jira ticket: DJM-974