Return hashes with correctly ordered keys #5315

Merged: 8 commits merged into master on Apr 7, 2025

Conversation

@rmosolgo (Owner) commented Apr 2, 2025

Trying again at #5011 and #4252, replaces #5060

GraphQL-Ruby happens to return correctly-ordered results in simple cases because of Ruby's ordered Hash (results are inserted in the order they're encountered in the query). But a few things can cause results to come back out of order:

  • Runtime directives (but not @skip or @include) are evaluated separately then merged back in -- merging causes mis-ordering
  • Dataloaded values are written to the hash after non-Dataloaded values, therefore mis-ordered

lazy_resolve(...) values (like GraphQL-Batch's promises) don't have this problem because they're written immediately. But I'm trying to rebuild lazy resolve on top of dataloader to simplify execution code, so they might end up having this problem.
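
For illustration, here's a minimal plain-Ruby sketch of the failure mode (not GraphQL-Ruby internals; the field names and values are made up):

  # Ruby Hashes preserve insertion order, so a key written late ends up at
  # the end of the result even if the query requested it first.
  result = {}
  result["name"] = "Cardamom"  # resolved synchronously
  # "id" was requested before "name", but suppose it came out of a
  # Dataloader batch that resolved afterwards:
  result["id"] = 1
  p result.keys  # => ["name", "id"], not the query order ["id", "name"]

  # Merging a separately-executed selection (e.g. for a runtime directive)
  # has the same effect: newly merged keys are appended after existing ones.
  result.merge!("avatarUrl" => "https://example.com/a.png")
  p result.keys  # => ["name", "id", "avatarUrl"]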

TODO:

  • Make sure this performance is as good as it can be, especially minimal allocations on hot paths
    • Can it detect when order might be wrong, and only re-sort in that case? (See the sketch after this list.)
      • Yes, this worked out alright
    • Can it somehow re-order only once? (Currently it reorders on every write, which is silly.)
    • Use a first-class ordered data structure (Array)? But current implementation depends on mutable hashes.
      • I think this is worth exploring down the line. The trick will be maintaining compatibility.
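
Here's a rough sketch of that detect-and-re-sort idea, assuming the runtime already knows the expected key order for a selection. It's not the actual implementation, just one way to express the check-then-reorder step:

  # Hypothetical helper; not part of GraphQL-Ruby's API.
  def reorder_if_needed(hash, expected_order)
    # Fast path: keys were inserted in the expected order, so do nothing.
    return hash if hash.keys == expected_order

    # Slow path: rebuild in the expected order. Hash#replace mutates the
    # same object, which matters because the current implementation depends
    # on mutable hashes that other code may still hold references to.
    ordered = {}
    expected_order.each { |key| ordered[key] = hash[key] if hash.key?(key) }
    hash.replace(ordered)
  end

  result = { "name" => "Cardamom", "id" => 1 }
  reorder_if_needed(result, ["id", "name"])
  p result.keys  # => ["id", "name"]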

@rmosolgo (Owner, Author) commented Apr 3, 2025

Without any optimization, or really even thinking about perf:

  ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [x86_64-darwin22]

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          11.212 (±17.8%) i/s   (89.19 ms/i) -    110.000 in  10.048726s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 10553104 bytes (78428 objects)

  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         829.929 (± 5.1%) i/s    (1.20 ms/i) -      8.330k in  10.063511s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 133832 bytes (1273 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                         32.928 (± 9.1%) i/s   (30.37 ms/i) -    328.000 in  10.036932s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3828208 bytes (20425 objects)

So, 1-4 allocations per object in the response, ~10-20% slowdown. Gotta do better!

@rmosolgo (Owner, Author) commented Apr 3, 2025

Some cleanup got me a long way there, objects-wise:

  ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [x86_64-darwin22]

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          11.369 (±17.6%) i/s   (87.96 ms/i) -    111.000 in  10.050506s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 9873064 bytes (75427 objects)


  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         895.174 (± 5.4%) i/s    (1.12 ms/i) -      8.944k in  10.027436s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 126792 bytes (1187 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                         34.633 (± 8.7%) i/s   (28.87 ms/i) -    344.000 in  10.011065s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3788168 bytes (19424 objects)

Objects-wise, I'm down to one extra object per object in the response: the list of ordered fields. I'm going to need that somehow, so if I'm going to make up the difference, I'll have to eliminate an object somewhere else.

Runtime is still not good enough though.

@rmosolgo (Owner, Author) commented Apr 3, 2025

A few changes brought the runtime a lot closer:

  bench:profile_large_result
-                          12.617 (±15.9%) i/s   (79.26 ms/i) -    123.000 in  10.056377s
+                          12.445 (±16.1%) i/s   (80.35 ms/i) -    122.000 in  10.071001s
- Total allocated: 9225024 bytes (74426 objects)
+ Total allocated: 9873064 bytes (75427 objects)


  bench:profile_small_result 

-                         989.907 (± 3.8%) i/s    (1.01 ms/i) -      9.951k in  10.068924s
+                         969.758 (± 2.9%) i/s    (1.03 ms/i) -      9.696k in  10.007819s
- Total allocated: 119752 bytes (1131 objects)
+ Total allocated: 126792 bytes (1187 objects)

  EAGER=1 bench:profile_large_result 

-                          37.763 (± 5.3%) i/s   (26.48 ms/i) -    378.000 in  10.042316s
+                          33.009 (± 6.1%) i/s   (30.29 ms/i) -    330.000 in  10.050151s
- Total allocated: 3588128 bytes (18423 objects)
+ Total allocated: 3788168 bytes (19424 objects)

But the last benchmark is still a bummer -- more than 10% slower.

That benchmark is the least realistic: no application code, no lazy resolution, just GraphQL overhead.

I also ran EAGER=1 bench:profile_small_result and that's almost identical:

-                           1.241k (± 4.4%) i/s  (805.69 μs/i) -     12.416k in  10.024731s
+                           1.251k (± 3.9%) i/s  (799.58 μs/i) -     12.500k in  10.009851s

So I'm not sure why this one is showing slowness 🤔

@rmosolgo (Owner, Author) commented Apr 7, 2025

I think I'm gonna go for it -- it shows a regression in that one benchmark, but I think that's because it's a worst-case scenario for this change. There's so little else going on that the new overhead really stands out.

rmosolgo merged commit 6d473f5 into master on Apr 7, 2025
15 checks passed
rmosolgo deleted the dataloader-result-order branch on April 7, 2025 at 14:34