Return hashes with correctly ordered keys #5315

Merged: 8 commits merged into master on Apr 7, 2025

Conversation

@rmosolgo (Owner) commented Apr 2, 2025

Trying again at #5011 and #4252, replaces #5060

GraphQL-Ruby happens to return correctly-ordered results in simple cases because of Ruby's ordered Hash (results are inserted in the order they're encountered in the query). But a few things can cause results to come back out of order:

  • Runtime directives (but not @skip or @include) are evaluated separately then merged back in -- merging causes mis-ordering
  • Dataloaded values are written to the hash after non-Dataloaded values, therefore mis-ordered

lazy_resolve(...) values (like GraphQL-Batch's promises) don't have this problem because they're written immediately. But I'm trying to rebuild lazy resolve on top of dataloader to simplify execution code, so they might end up having this problem.
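
For illustration, here's a minimal plain-Ruby sketch of the failure mode (not GraphQL-Ruby internals; the field names and values are made up):

  # Ruby Hashes preserve insertion order, so a key written late ends up at
  # the end of the result even if the query requested it first.
  result = {}
  result["name"] = "Cardamom"  # resolved synchronously
  # "id" was requested before "name", but suppose it came out of a
  # Dataloader batch that resolved afterwards:
  result["id"] = 1
  p result.keys  # => ["name", "id"], not the query order ["id", "name"]

  # Merging a separately-executed selection (e.g. for a runtime directive)
  # has the same effect: newly merged keys are appended after existing ones.
  result.merge!("avatarUrl" => "https://example.com/a.png")
  p result.keys  # => ["name", "id", "avatarUrl"]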

TODO:

  • Make sure this performance is as good as it can be, especially minimal allocations on hot paths
    • Can it detect when order might be wrong, and only re-sort in that case? (See the sketch after this list.)
      • Yes, this worked out alright
    • Can it somehow re-order only once? (Currently it reorders on every write, which is silly.)
    • Use a first-class ordered data structure (Array)? But current implementation depends on mutable hashes.
      • I think this is worth exploring down the line. The trick will be maintaining compatibility.
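
Here's a rough sketch of that detect-and-re-sort idea, assuming the runtime already knows the expected key order for a selection. It's not the actual implementation, just one way to express the check-then-reorder step:

  # Hypothetical helper; not part of GraphQL-Ruby's API.
  def reorder_if_needed(hash, expected_order)
    # Fast path: keys were inserted in the expected order, so do nothing.
    return hash if hash.keys == expected_order

    # Slow path: rebuild in the expected order. Hash#replace mutates the
    # same object, which matters because the current implementation depends
    # on mutable hashes that other code may still hold references to.
    ordered = {}
    expected_order.each { |key| ordered[key] = hash[key] if hash.key?(key) }
    hash.replace(ordered)
  end

  result = { "name" => "Cardamom", "id" => 1 }
  reorder_if_needed(result, ["id", "name"])
  p result.keys  # => ["id", "name"]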

@rmosolgo (Owner, Author) commented Apr 3, 2025

Without any optimization, or really even thinking about perf:

  ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [x86_64-darwin22]

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          11.212 (±17.8%) i/s   (89.19 ms/i) -    110.000 in  10.048726s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 10553104 bytes (78428 objects)

  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         829.929 (± 5.1%) i/s    (1.20 ms/i) -      8.330k in  10.063511s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 133832 bytes (1273 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                         32.928 (± 9.1%) i/s   (30.37 ms/i) -    328.000 in  10.036932s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3828208 bytes (20425 objects)

So, 1-4 allocations per object in the response, ~10-20% slowdown. Gotta do better!

@rmosolgo (Owner, Author) commented Apr 3, 2025

Some cleanup got me a long way there, objects-wise:

  ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [x86_64-darwin22]

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          11.369 (±17.6%) i/s   (87.96 ms/i) -    111.000 in  10.050506s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 9873064 bytes (75427 objects)


  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         895.174 (± 5.4%) i/s    (1.12 ms/i) -      8.944k in  10.027436s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 126792 bytes (1187 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                         34.633 (± 8.7%) i/s   (28.87 ms/i) -    344.000 in  10.011065s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3788168 bytes (19424 objects)

Objects-wise, I'm down to one extra object per object in the response: the list of ordered fields. I'm going to need that somehow, so if I'm going to make up the difference, I'll have to eliminate an object somewhere else.

Runtime is still not good enough though.

@rmosolgo (Owner, Author) commented Apr 3, 2025

A few changes brought the runtime a lot closer:

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          12.445 (±16.1%) i/s   (80.35 ms/i) -    122.000 in  10.071001s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 9873064 bytes (75427 objects)


  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         969.758 (± 2.9%) i/s    (1.03 ms/i) -      9.696k in  10.007819s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 126792 bytes (1187 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                          33.009 (± 6.1%) i/s   (30.29 ms/i) -    330.000 in  10.050151s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3788168 bytes (19424 objects)

But the last benchmark is still a bummer -- more than 10% slower.

That benchmark is the least realistic: no application code, no lazy resolution, just GraphQL overhead.

I also ran EAGER=1 bench:profile_small_result and that's almost identical:

-                           1.241k (± 4.4%) i/s  (805.69 μs/i) -     12.416k in  10.024731s
+                           1.251k (± 3.9%) i/s  (799.58 μs/i) -     12.500k in  10.009851s

So I'm not sure why this one is showing slowness 🤔

@rmosolgo (Owner, Author) commented Apr 7, 2025

I think I'm gonna go for it -- it shows a regression in that one benchmark, but I think that's because it's a worst-case scenario for this change. There's so little else going on that the new overhead really stands out.

rmosolgo merged commit 6d473f5 into master on Apr 7, 2025
15 checks passed
rmosolgo deleted the dataloader-result-order branch on April 7, 2025 at 14:34