
v1/topdown/graphql: Cache GraphQL schema parse results (#5377) #7457


Open: robmyersrobmyers (Contributor) wants to merge 1 commit into main
robmyersrobmyers commented Mar 20, 2025

This commit stores parsed GraphQL schemas in the cache, which improves the performance of GraphQL operations that parse the same schema more than once.

Queries are not cached.

pkg: github.com/open-policy-agent/opa/v1/topdown
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-16	                   15519            100178 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-16               371311              3383 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_object-16                            2133            542355 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_object-16                 3471            528579 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-16                        7105            193325 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-16            66594             18093 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-16                              6429            173773 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-16                   6523            170819 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_-_string-16                               16352             72777 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-16                    16083             73548 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_-_string-16                                 14320             83589 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-16                      71486             15463 ns/op
BenchmarkGraphQLParse/Trivial_Schema_-_string-16                                    3380            321490 ns/op
BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-16                        13909             87633 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-16                           3435            327646 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-16               13844             85213 ns/op
PASS
ok      github.com/open-policy-agent/opa/v1/topdown     112.465s

Resolves: #5377

Why the changes in this PR are needed?

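GraphQL schema parsing can be an expensive operation, so caching the parsed result avoids repeating that work. In the benchmarks above, validating a trivial schema string drops from roughly 100 µs to roughly 3.4 µs per operation once the parsed schema is cached.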

What are the changes in this PR?

This PR leverages the InterQueryBuiltinValueCache to store parsed GraphQL schema data.
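
The lookup-then-parse pattern described here can be sketched as follows. This is a minimal illustration, not the PR's code: `valueCache`, `parsedSchema`, `parseSchema`, and `cachedParse` are hypothetical stand-ins, and the cache type is a plain mutex-guarded map rather than OPA's actual `InterQueryBuiltinValueCache` API.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// valueCache is a hypothetical stand-in for an inter-query value cache.
// It is NOT OPA's InterQueryBuiltinValueCache; it only mimics Get/Insert.
type valueCache struct {
	mu      sync.Mutex
	entries map[string]any
}

func newValueCache() *valueCache {
	return &valueCache{entries: map[string]any{}}
}

func (c *valueCache) Get(key string) (any, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.entries[key]
	return v, ok
}

func (c *valueCache) Insert(key string, v any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = v
}

// parsedSchema is a placeholder for the result of an expensive parse.
type parsedSchema struct {
	TypeNames []string
}

// parseCalls counts parser invocations so the cache's effect is observable.
var parseCalls int

// parseSchema is a toy parser (assumption): it only collects the word
// that follows each "type" keyword.
func parseSchema(src string) *parsedSchema {
	parseCalls++
	var names []string
	fields := strings.Fields(src)
	for i, f := range fields {
		if f == "type" && i+1 < len(fields) {
			names = append(names, fields[i+1])
		}
	}
	return &parsedSchema{TypeNames: names}
}

// cachedParse consults the cache first and only parses on a miss.
func cachedParse(c *valueCache, src string) *parsedSchema {
	if v, ok := c.Get(src); ok {
		if s, ok := v.(*parsedSchema); ok {
			return s
		}
	}
	s := parseSchema(src)
	c.Insert(src, s)
	return s
}

func main() {
	c := newValueCache()
	schema := "type Query { hello: String }"
	cachedParse(c, schema)
	cachedParse(c, schema) // served from the cache; no second parse
	fmt.Println("parse calls:", parseCalls)
}
```

Keying on the raw schema text keeps the sketch short; a real implementation would more likely key on a digest or on the built-in's operands.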

Notes to assist PR review:

The performance improvements are more dramatic on more complex schemas, but the complex schemas were omitted from the included test cases because they are quite large. Those numbers are captured in the issue thread.

Further comments:

The GraphQL builtin code has some repeated patterns which could probably be cleaned up, but I felt that was out of scope for adding a caching layer.

netlify bot commented Mar 20, 2025

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit f9a0b8f
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/67fe99956cf4cc0008f6e848
😎 Deploy Preview https://deploy-preview-7457--openpolicyagent.netlify.app

robmyersrobmyers force-pushed the 5377_cache_gql_parse branch 2 times, most recently from 77cd7a2 to c8871c7, on March 20, 2025 16:20
johanfylling (Contributor) left a comment

Thank you for working on this! 😃

Just a couple of questions/comments.


    if k, keyOk := cacheKeyWithPrefix(bctx, operands, "gql_schema_ast"); keyOk {
        key = k
        if val, ok := bctx.InterQueryBuiltinValueCache.Get(ast.StringTerm(key).Value); ok {

Contributor

I'd recommend using a named value cache here. The benefits are:

  • Caching is isolated to the GraphQL built-in, and entries aren't "competing" with other built-ins
  • No need to prefix the key, for above reason (looks like we need internal differentiation, though)
  • Named caches can be individually configured (complete with individual default settings)

See the io.jwt.* built-ins for inspiration.
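
To make the isolation argument concrete, here is a rough sketch of what "named" caches buy you. Everything in it is hypothetical (the `registry` and `namedCache` types, the cache names `"graphql"` and `"io_jwt"`, and the FIFO eviction); it does not reflect OPA's actual named value cache implementation or configuration keys.

```go
package main

import "fmt"

// namedCache is a hypothetical bounded cache with simple FIFO eviction.
// A maxEntries value <= 0 is treated as "unbounded" in this sketch.
type namedCache struct {
	maxEntries int
	order      []string
	entries    map[string]string
}

func (c *namedCache) Insert(key, val string) {
	if _, exists := c.entries[key]; !exists {
		if c.maxEntries > 0 && len(c.order) >= c.maxEntries {
			// Evict the oldest entry of THIS cache only.
			oldest := c.order[0]
			c.order = c.order[1:]
			delete(c.entries, oldest)
		}
		c.order = append(c.order, key)
	}
	c.entries[key] = val
}

func (c *namedCache) Get(key string) (string, bool) {
	v, ok := c.entries[key]
	return v, ok
}

// registry maps a cache name to its own independently sized cache.
type registry map[string]*namedCache

// Named returns the cache registered under name, creating it with the
// given limit on first use.
func (r registry) Named(name string, maxEntries int) *namedCache {
	if c, ok := r[name]; ok {
		return c
	}
	c := &namedCache{maxEntries: maxEntries, entries: map[string]string{}}
	r[name] = c
	return c
}

func main() {
	r := registry{}
	gql := r.Named("graphql", 2) // hypothetical name and limit
	jwt := r.Named("io_jwt", 1)  // hypothetical name and limit

	gql.Insert("schema-a", "parsed A")
	jwt.Insert("token-1", "verified 1")
	jwt.Insert("token-2", "verified 2") // evicts token-1 within "io_jwt" only

	_, ok := gql.Get("schema-a")
	fmt.Println("graphql entry survived:", ok)
}
```

Because each name owns its own entry budget, churn in one built-in's cache cannot push out another built-in's entries, which is the "not competing" benefit from the first bullet.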

Contributor

@robmyersrobmyers, what are your thoughts on using a named value cache? Do you want us to help make those changes?

Contributor Author

I'm working on it now, but I seem to be having trouble initializing the cache. Right now I'm looking at 3 different named value caches, one per cached type, but that seems a bit inelegant. Should I instead create a single named cache and use the cache key to differentiate the types? Let me know your preference and I'll try to get it into good shape before I get your help to take it over the finish line. :)

Contributor Author

Seems to be working; feedback welcome. Not sure people will love exposing 3 different named cache configurations.

Contributor

Three separate configuration options might be overkill, like you say 👍. A single named cache with differentiating cache keys for the "sub-types" sounds like a good solution to me.
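
One way to picture "a single named cache with differentiating cache keys" is the sketch below. It is an assumption-laden illustration: only the `"gql_schema_ast"` tag appears in the diff excerpt above, while `"gql_schema_valid"`, the `cacheKey` helper, and the SHA-256 digest are made up for the example and may not match what the PR ends up doing.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	kindSchemaAST   = "gql_schema_ast"   // tag seen in the reviewed diff
	kindSchemaValid = "gql_schema_valid" // hypothetical second sub-type
)

// cacheKey (hypothetical helper) combines a sub-type tag with a digest
// of the input, so different kinds of results for the same schema get
// distinct keys inside one shared cache.
func cacheKey(kind, input string) string {
	sum := sha256.Sum256([]byte(input))
	return kind + ":" + hex.EncodeToString(sum[:])
}

func main() {
	// A plain map stands in for the single named cache.
	cache := map[string]any{}
	schema := "type Query { hello: String }"

	cache[cacheKey(kindSchemaAST, schema)] = []string{"Query"}
	cache[cacheKey(kindSchemaValid, schema)] = true

	fmt.Println("entries for one schema:", len(cache))
}
```

This keeps a single configuration surface for users while still letting the built-in store several kinds of derived values per schema.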

Contributor Author

Sounds good - I'll resurrect the cache key stuff and consolidate to one named cache.

Contributor Author

I think it's ready - please take a look and let me know if we missed anything.

Updated bench results with a few less trivial inputs:

    BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-10                           23740             51418 ns/op
    BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-10               965624              1187 ns/op
    BenchmarkGraphQLSchemaIsValid/Schema_w/_1000_types_-_string-10                       751           1578718 ns/op
    BenchmarkGraphQLSchemaIsValid/Schema_w/_1000_types_with_cache_-_string-10          33801             34915 ns/op
    BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-10                       17491             67845 ns/op
    BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-10           124808              9801 ns/op
    BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-10                              9914            116169 ns/op
    BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-10                 678207              1640 ns/op
    BenchmarkGraphQLParseQuery/Trivial_Query_-_string-10                               26994             44328 ns/op
    BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-10                    26922             44386 ns/op
    BenchmarkGraphQLIsValid/Trivial_Schema_-_string-10                                 21441             55487 ns/op
    BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-10                     152800              7384 ns/op
    BenchmarkGraphQLIsValid/Schema_w/_1000_types_-_string-10                             758           1568036 ns/op
    BenchmarkGraphQLIsValid/Schema_w/_1000_types_with_cache_-_string-10                28108             43051 ns/op
    BenchmarkGraphQLParse/Trivial_Schema_-_string-10                                    5119            216610 ns/op
    BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-10                        24186             49551 ns/op
    BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-10                           5241            217419 ns/op
    BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-10               23911             50004 ns/op
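
For readers who want the ratios rather than raw ns/op, the small sketch below computes speedups from a few of the rows above. The `speedup` helper and the row labels are illustrative only; the numbers are copied from the benchmark output. The 1000-type schema case works out to roughly 45x, while the query-parsing rows stay at about 1x, which is expected since queries are not cached.

```go
package main

import "fmt"

// speedup reports how many times faster the cached run is, given the
// ns/op figures of the uncached and cached benchmarks.
func speedup(uncachedNs, cachedNs float64) float64 {
	return uncachedNs / cachedNs
}

func main() {
	// ns/op values taken from the benchmark results above.
	rows := []struct {
		name             string
		uncached, cached float64
	}{
		{"SchemaIsValid/Trivial string", 51418, 1187},
		{"SchemaIsValid/1000 types", 1578718, 34915},
		{"ParseSchema/Trivial string", 116169, 1640},
		{"ParseQuery/Trivial string", 44328, 44386},
	}
	for _, r := range rows {
		fmt.Printf("%-30s %.1fx\n", r.name, speedup(r.uncached, r.cached))
	}
}
```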

robmyersrobmyers force-pushed the 5377_cache_gql_parse branch 4 times, most recently from 13dbd37 to 32ad5f0, on April 12, 2025 06:56
robmyersrobmyers force-pushed the 5377_cache_gql_parse branch 2 times, most recently from fad8810 to d88bd17, on April 14, 2025 14:44
…gent#5377)

This commit stores parsed GraphQL schemas to the cache, which improves
the performance of GraphQL operations that parse the schema more than once.

Queries are not cached.

BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_string-10	         	   23740	     51418 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_string-10         	  965624	      1187 ns/op
BenchmarkGraphQLSchemaIsValid/Schema_w/_1000_types_-_string-10              	     751	   1578718 ns/op
BenchmarkGraphQLSchemaIsValid/Schema_w/_1000_types_with_cache_-_string-10   	   33801	     34915 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_-_AST_object-10                	   17491	     67845 ns/op
BenchmarkGraphQLSchemaIsValid/Trivial_Schema_with_cache_-_AST_object-10     	  124808	      9801 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_-_string-10                      	    9914	    116169 ns/op
BenchmarkGraphQLParseSchema/Trivial_Schema_with_cache_-_string-10           	  678207	      1640 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_-_string-10                        	   26994	     44328 ns/op
BenchmarkGraphQLParseQuery/Trivial_Query_with_cache_-_string-10             	   26922	     44386 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_-_string-10                          	   21441	     55487 ns/op
BenchmarkGraphQLIsValid/Trivial_Schema_with_cache_-_string-10               	  152800	      7384 ns/op
BenchmarkGraphQLIsValid/Schema_w/_1000_types_-_string-10                    	     758	   1568036 ns/op
BenchmarkGraphQLIsValid/Schema_w/_1000_types_with_cache_-_string-10         	   28108	     43051 ns/op
BenchmarkGraphQLParse/Trivial_Schema_-_string-10                            	    5119	    216610 ns/op
BenchmarkGraphQLParse/Trivial_Schema_with_cache_-_string-10                 	   24186	     49551 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_-_string-10                   	    5241	    217419 ns/op
BenchmarkGraphQLParseAndVerify/Trivial_Schema_with_cache_-_string-10        	   23911	     50004 ns/op

Resolves: open-policy-agent#5377

Signed-off-by: Rob Myers <[email protected]>

Successfully merging this pull request may close these issues.

Possible to cache graphql.parse() results?