-
Notifications
You must be signed in to change notification settings - Fork 1k
Small Tags performance improvements #6185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
a268978
to
1e8788f
Compare
tags[j++] = tags[i]; | ||
current = next; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've extracted dedup to a separate method to see that in isolation i can reduce ~1000 ns to ~750 ns on workstation by this (not sure how many elements i was feeding in, either 64 or 128). I haven't tested explicitly the same above, but both can be dropped if you feel this is too much.
micrometer-core/src/main/java/io/micrometer/core/instrument/Tags.java
Outdated
Show resolved
Hide resolved
return toTags(tagsCollection.toArray(EMPTY_TAG_ARRAY)); | ||
} | ||
else if (emptyIterable(tags)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emptyIterable was used more widely in a version that hasn't reached publication, maybe we want to get rid of it completely. There are some repeated checks, compiler should be able to pick up invariants during inlining, but i've never explicitly checked.
return Tags.empty(); | ||
} | ||
else if (tags instanceof Tags) { | ||
if (tags instanceof Tags) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since EMPTY is instanceof Tags, it will fall in this branch as well. Instanceof requires one extra memory lookup though, so idk.
Null is also recognized only at the emptyIterable stage, but this shouldn't bring additional cost, i don't expect JIT to try to check for null thrice
} | ||
if (Objects.equals(this, other)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Object.equals is basically a null-safe version of calling equals directly, and the first thing this code was doing is an access to other
that isn't possible in the null case.
1e8788f
to
a574645
Compare
Signed-off-by: etki <[email protected]>
a574645
to
c28c6a6
Compare
This is the tiniest performance bump born out of the same story as #6113. There are few changes that benefit insignificantly (single digit percent) - it's completely OK to say "we'd want to keep things as we have here" (i'm only having fun and getting some extra experience here). It's easier to annotate specific code parts, so you will find individual comments below.
The performance was evaluated with benchmarks from #6174, OpenJDK 21.0.2 and Intel N100 fixed at 2GHz. Since there is different processing for different subtypes of Iterable, a parameter matrix with different distributions of these implementations is introduced, when category is
absent
, then all implementations are uniformly distributed, and when category is X, it means that 90% of input is X, and the rest is evenly divided between other implementations. Every instance contains 0-68 tags (a uniformly selected number), with the exception of null. Please also note thatset
uses HashSet, which should be slower than JDK9+ Set.of(). What is used by the real clients more (Set.of() or HashSet) is unknown to me.Concat: common case, any implementation equiprobable, all standard counters
Concat: matrix
3453.615 ± 12.311
3332.854 ± 15.092
1553.365 ± 8.611
1503.860 ± 7.085
2680.919 ± 15.952
2627.852 ± 12.536
3257.486 ± 8.496
3125.463 ± 14.789
7297.398 ± 12.989
7078.009 ± 6.843
3427.177 ± 18.206
3313.541 ± 7.130
1476.249 ± 2.349
1438.450 ± 2.499
194.101 ± 1.208
193.595 ± 1.029
310.539 ± 1.333
294.154 ± 1.905
838.107 ± 1.863
843.229 ± 2.546
4835.388 ± 4.700
4741.637 ± 11.517
1477.644 ± 3.676
1421.480 ± 3.933
2452.261 ± 13.277
2398.422 ± 12.465
335.398 ± 2.342
310.834 ± 2.399
1515.396 ± 3.010
1473.454 ± 7.279
2111.102 ± 6.931
2019.852 ± 2.921
6109.320 ± 13.158
5985.968 ± 11.379
2458.731 ± 5.363
2414.234 ± 11.044
3033.881 ± 4.737
3003.112 ± 14.427
845.153 ± 2.803
849.731 ± 1.805
2067.637 ± 2.247
2023.014 ± 5.055
2634.236 ± 5.456
2586.595 ± 5.508
6802.838 ± 10.706
6557.061 ± 16.576
3054.109 ± 7.344
2915.368 ± 8.865
7288.231 ± 11.302
7101.247 ± 16.265
4811.101 ± 8.812
4657.866 ± 4.688
6104.217 ± 7.143
5942.757 ± 11.124
6679.737 ± 8.977
6518.054 ± 12.780
10592.943 ± 24.444
10396.221 ± 19.124
7270.077 ± 6.210
7126.497 ± 23.751
3440.212 ± 13.319
3315.371 ± 10.643
1568.332 ± 7.186
1490.758 ± 3.835
2664.903 ± 6.252
2562.616 ± 15.844
3212.028 ± 6.383
3151.855 ± 14.485
7238.005 ± 14.185
7120.775 ± 18.257
3493.186 ± 11.978
3303.420 ± 11.901
Of: common case, all implementations equiprobable, standard counters
Of: distributions
Not sure what's going on with lsit here. Needs just a more more detailed look, but i'm not sure if i'll ever have the time.
1252.774 ± 2.228
1212.606 ± 3.810
73.245 ± 0.453
70.854 ± 0.376
88.168 ± 0.343
69.111 ± 0.371
589.633 ± 1.413
603.960 ± 1.753
4578.833 ± 9.501
4451.159 ± 9.031
1255.498 ± 3.867
1226.269 ± 4.025