-
Notifications
You must be signed in to change notification settings - Fork 15
Handling exceptions from user-supplied lambdas in GraphRequestAction #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
GraphRequestAction calls `buildFromSession`. There's at least one case where this can end up calling a user-supplied lambda -- `executeGraphFluent` / `GraphFluentStatementFromScalaLambda`. If this user-supplied lambda throws an exception, then the exception propagates out of the sendQuery method uncaught, without calling `next ! <session>` to keep the Gatling chain running. From a user POV, this manifested as all tasks quickly becoming active (because `sendQuery` reaches the exception-throwing line almost immediately for each request), but no tasks becoming done (because the session wasn't passed to the `next` reference defined in Gatling's `ChainableAction`).
This test simulates the case where a user provides a lambda that generates a GraphStatement (as with one of the `executeGraphFluent` overloads), but that lambda throws an exception instead of returning normally. The session-chain-checking portion of the test is original, but the trailing assertions related to logging are adapted from a very similar block of existing code in CqlRequestActionSpec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this!
next ! session.markAsFailed | ||
throw e | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You wrapped the call dseAttributes.statement.buildFromSession(session)
with a try catch. But according to the type system, this call should return a Validation[GraphStatement]
. A Validation
is a Gatling structure that represents either a Success
or Failure
(http://gatling.io/docs/current/session/validation/). So really, this case of an uncaught exception should have been handled before we return to GraphRequestAction.
I think that the fix should be located in DseGraphStatements.scala:54
instead of here. Proposition: replace L54 in DseGraphStatements
by safely()(lambda(gatlingSession).success)
. This will use a built-in Gatling function that does exactly what you want, while also satisfying the type system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the insight.
Gatling's safely
is a lot more concise and idiomatic than try-catch
. There's one thing that makes me uneasy about it though -- I don't see a way to log the stacktrace of the exception caught by safely
before it goes out of scope. For a user-provided lambda, there could be cases where the exception stacktrace is useful, potentially significantly more useful than the exception type/message (e.g. NPE).
I just pushed a new commit that moves exception handling down to DseGraphStatements.scala:54 as directed, but uses Try { ... } match { ... }
instead of safely
. It's now similar to some of its peer implementations inside DseGraphStatements.scala
that also already used similar Try { ... } match { ... }
constructs, and I can still see lambda traces in the logs.
What do you think?
Theoretically, pushing down exception handling like this could allow some other implementation of buildFromSession
to induce a hang, if that implementation allows uncaught exceptions to propagate out to the caller; but it looks like some of the other implementations in DseGraphStatements
already go out of their way to prevent that, and I'm assuming your comment means that's the general rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a way to log the stacktrace of the exception caught by safely before it goes out of scope.
That's a good reason to use a try/catch block, indeed. +1 for that approach
Theoretically, pushing down exception handling like this could allow some other implementation of
buildFromSession
to induce a hang
I agree with you. I also wondered whether to keep the extra try/catch in GraphRequestAction, just to be 100% future proof. But I think the way to go is to honor the type system and make it clear that errors are prevented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you. I also wondered whether to keep the extra try/catch in GraphRequestAction, just to be 100% future proof. But I think the way to go is to honor the type system and make it clear that errors are prevented.
Sounds good, I'll stick to the pushdown approach and update the tests to match. Thank you for addressing this.
This commit responds to review feedback from Pierre, moving exception handling from GraphRequestAction.sendQuery's buildFromSession call to one specific implementation of buildFromSession. It eschews safely(...) so that we can log the exception stacktrace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from one nit, I think this PR is ready to be merged. Did you manage to validate the fix more globally on the ecdc project?
// Attempt to generate a graph statement from our parameters, propagating any uncaught exceptions after | ||
// continuing the chain with a failed session | ||
val stmt: Validation[GraphStatement] = | ||
dseAttributes.statement.buildFromSession(session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment lines are not accurate anymore here, could you restore the original line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll delete that comment
I think so; this is the gist of what I executed:
Before both the initial PR branch push and before my latest push, I checked that the PR's underlying gatling-dse-plugin change both (a) caused the chain to complete, with all tasks promptly transitioning from the active state to the done state and (b) logged each exception with its name, message, and stacktrace. However, my last commit was just to check direction. If the current direction is OK with you, then it's still not quite ready. My first push made Anyway, now that this overall approach has approval, I will delete the spec that I added initially, and maybe I can add a spec elsewhere for |
- Delete a vestigial comment from `GraphRequestAction` (thanks for finding this Pierre) - Delete an obsolete case from `GraphRequestActionSpec`; it was checking that GRA kept the chain alive despite `buildFromSession` exceptions, but since that behavior has been reverted, this spec is now unnecessary (and failing) - Add a new case to `DseGraphStatementSpec`, exercising a sort of equivalent scenario to the one deleted in the last bullet; it checks that `GraphFluentStatementFromScalaLambda.buildFromSession` catches and logs a runtime exception thrown by its inner lambda. This is similar to the case that got deleted, except that some of the assertions were rewritten as matchers in an attempt to be more consistent with the surrounding tests in the same file
I just pushed a comment and test fixup commit. I deleted the outdated I can't think of any other remaining work items that would block a hypothetical merge at this point. Please do let me know if there's anything else I can do to improve the PR -- code changes, squashing the branch, etc. |
GraphRequestAction implements ChainableAction, but there's a specific case where it won't pass the Gatling session forward to the next action in the chain, hanging Gatling. The specific case is when the user supplies a lambda to delay generation of the GraphStatement until GRA is already inside
sendQuery
on its own fixed-sized threadpool, and then that lambda throws an exception.Please feel free to be vicious with the PR review, as I'm well outside my starting point on this issue's debugging trajectory, and not intimately familiar with the components I'm touching here.
The DSP issue for which this PR's branch is named is sort of tangential to the PR, since the issue involves multiple potential problems in different components.