test: reproduce commonly cited timeout error #1390

Draft: wants to merge 26 commits into main

Conversation

danieljbruce (Contributor) commented May 6, 2025

Description

The main intent of this PR is to supply a test confirming that users can see the changes introduced by this PR. Primarily, the runQuery test checks that users get an error stack they can use to identify the root cause of the timeout errors they are seeing. Confirming this behaviour with a test is crucial for solving issues like #1176 going forward.

Impact

This is a crucial step towards solving #1176, giving us the tools we need to explore client library behaviour when the error from the issue occurs.

Testing

This PR only adds tests; it makes no source code changes. The tests are skipped because they currently fail, and the goal is to work out the source code changes that would allow them to pass.

Additional Information

Some related PRs:
googleapis/gax-nodejs#1740
googleapis/gax-nodejs#1650

Next Steps

  • We have a test for UNAVAILABLE on runQuery now, but we should apply the right fixes for DEADLINE_EXCEEDED errors
  • Let's add a test involving transactions

@product-auto-label bot added the "size: m" and "api: datastore" labels May 6, 2025
@product-auto-label bot added the "size: l" label and removed the "size: m" label May 6, 2025
done();
} catch (e) {
done(e);
}
Contributor Author

The only real change is to make the test runner throw an error if the script throws an error, instead of failing silently.
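
For readers skimming the thread, the pattern in the snippet above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `Done` and `runAndReport` are hypothetical names, with `Done` standing in for the test runner's completion callback.

```typescript
// Hypothetical stand-in for the test runner's completion callback.
type Done = (err?: unknown) => void;

// Runs `script` and reports the outcome through `done`.
// The catch branch forwards any thrown error, so a failing script
// marks the test as failed instead of going unreported.
function runAndReport(script: () => void, done: Done): void {
  try {
    script();
    done(); // success: no error argument
  } catch (e) {
    done(e); // failure: hand the error to the runner
  }
}
```

Calling `done(e)` in the catch branch is what surfaces the failure to the runner.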

// The error message is based on client library behavior.
assert.strictEqual(
(e as Error).message,
'4 DEADLINE_EXCEEDED: error details: error count: 1',
Contributor Author

The error message should be something like this. We need to investigate a bit more what the client should do in this case to determine what the exact error message should be, but for now it's fine to backlog this as a TODO.

@@ -0,0 +1,126 @@
// Copyright 2025 Google LLC
Contributor Author

This file contains utilities for working with the mock server, including a function that produces a deterministic set of errors and a function that shuts down the mock server.
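
As a rough idea of what a shutdown utility can look like, here is a sketch that wraps a callback-style graceful shutdown in a promise. The `ShutdownableServer` interface and the `shutdownServer` name are assumptions made for illustration; the interface is modelled on the callback-style `tryShutdown` method of a gRPC server, and the helper in this PR may be shaped differently.

```typescript
// Hypothetical minimal shape of a server that supports graceful shutdown.
// Modelled on a callback-style `tryShutdown`; not the PR's actual type.
interface ShutdownableServer {
  tryShutdown(callback: (error?: Error) => void): void;
}

// Wraps the callback API in a promise so tests can `await` the shutdown
// in an `after` hook and see a rejection if the shutdown fails.
function shutdownServer(server: ShutdownableServer): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    server.tryShutdown(error => (error ? reject(error) : resolve()));
  });
}
```

Returning a promise keeps teardown failures visible to the test runner rather than leaving them inside a callback.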

callback: (arg1: string | null, arg2: {}) => {},
) {
// SET A BREAKPOINT HERE AND EXPLORE `call` TO SEE THE REQUEST.
callback(null, {message: 'Hello'});
Contributor Author

This grpcEndpoint function has just been moved inside the only place it is used.

@danieljbruce changed the title from "414574369 test error stack end to end" to "test: reproduce commonly cited timeout error" May 6, 2025
Comment on lines 62 to 68
* `DEADLINE_EXCEEDED` code (4), a details message indicating the number of
* errors generated so far by this instance, and some metadata.
*
* @returns {ServiceError} A `ServiceError` object representing a simulated
* gRPC error.
*/
generateError() {


nit: without reading the comment, it's hard to tell that generateError generates a DEADLINE_EXCEEDED error.

Contributor Author

I restructured this so that we pass the error code into generateError. This might make the intent of this method clearer.
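
A sketch of what that restructuring could look like, under stated assumptions: the `Status` enum below is a local subset of the gRPC status codes, `SimulatedServiceError` is a hypothetical simplified stand-in for the real `ServiceError` type (metadata omitted), and the message format simply mirrors the string asserted in the test in this PR. The actual class may differ.

```typescript
// Local subset of gRPC status codes, using their standard numeric values.
enum Status {
  DEADLINE_EXCEEDED = 4,
  NOT_FOUND = 5,
  UNAVAILABLE = 14,
}

// Hypothetical, simplified stand-in for a gRPC service error.
interface SimulatedServiceError extends Error {
  code: Status;
  details: string;
}

class ErrorGenerator {
  // Number of errors produced so far by this instance; this is what
  // makes the sequence of generated errors deterministic.
  private errorCount = 0;

  // The caller chooses the status code, so the call site shows which
  // kind of error is being simulated.
  generateError(code: Status): SimulatedServiceError {
    this.errorCount++;
    const details = `error details: error count: ${this.errorCount}`;
    // Assumed message format: "<numeric code> <code name>: <details>".
    const error = new Error(
      `${code} ${Status[code]}: ${details}`,
    ) as SimulatedServiceError;
    error.code = code;
    error.details = details;
    return error;
  }
}
```

With this shape, a call such as `generateError(Status.DEADLINE_EXCEEDED)` states the simulated failure at the call site, which addresses the readability nit above.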

@@ -0,0 +1,51 @@
// Copyright 2025 Google LLC


nit: the test file is also a bit confusing. I think your intention is to differentiate between tests for a non-retryable error like NOT_FOUND and tests for a DEADLINE_EXCEEDED error.

But createreadstream vs. runquery in the name makes it confusing, as it suggests the difference is about functionality.

Contributor Author @danieljbruce, May 7, 2025

The intent is to differentiate between the calls to the lookup grpc endpoint and the runQuery grpc endpoint.

Note that this PR is in a Draft stage. Although the early feedback is appreciated, I think I'm going to make some larger adjustments like removing this file so that we can lock in the progress we have with the UNAVAILABLE test on runQuery. This gives us the confidence that we can see the error stack for these types of calls.

Labels
api: datastore Issues related to the googleapis/nodejs-datastore API. size: l Pull request size is large.

2 participants