feat: Support custom BigQuery storage api endpoint #1501

Open

nj1973 wants to merge 25 commits into base: develop

Conversation

nj1973 (Contributor) commented Apr 11, 2025

Description of changes

The PR adds support for a custom googleapis endpoint for the BigQuery Storage API, in addition to the BigQuery and Spanner APIs that we already support.
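
For illustration, a minimal sketch of how a custom BigQuery Storage API endpoint is typically supplied in Python; the endpoint URL is a placeholder and this is not necessarily the exact wiring used in the PR:

```python
from google.api_core.client_options import ClientOptions
from google.cloud import bigquery_storage_v1

# Placeholder endpoint, e.g. one exposed through a PSC; replace with your own.
BQ_STORAGE_ENDPOINT = "https://bigquerystorage.p.googleapis.com"

# ClientOptions(api_endpoint=...) routes BigQuery Storage API calls to the
# custom endpoint instead of the default public googleapis.com host.
bqstorage_client = bigquery_storage_v1.BigQueryReadClient(
    client_options=ClientOptions(api_endpoint=BQ_STORAGE_ENDPOINT)
)
```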

I have manually tested this in a project with a PSC and firewall rules that block access to the standard googleapis.com endpoints, and the changes work. I don't think we can realistically add integration tests for this setup, though.

Now that I have a project I can test in, I will be able to test Secret Manager integration and raise an issue for that too. As we have a customer waiting on these changes, I didn't plan to test or include that support in this PR.

Issues to be closed

Closes #1369

Checklist

  • I have followed the CONTRIBUTING Guide.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated any relevant documentation to reflect my changes, if applicable.
  • I have added unit and/or integration tests relevant to my change, as needed.
  • I have already checked locally that all unit tests and linting are passing (use the tests/local_check.sh script).
  • I have manually executed end-to-end (E2E) testing with the affected databases/engines.

nj1973 and others added 19 commits December 15, 2024 17:17
nj1973 (Contributor, Author) commented Apr 11, 2025

/gcbrun

nj1973 marked this pull request as ready for review April 11, 2025 17:05
nj1973 (Contributor, Author) commented Apr 16, 2025

/gcbrun

sundar-mudupalli-work (Collaborator) commented

Neil,

I don't know how this is supposed to work. I am a bit confused.

Sundar Mudupalli

client_info=_create_client_info(application_name),
)
else:
self.client = bigquery_client
sundar-mudupalli-work (Collaborator) commented on the diff above, Apr 16, 2025

Hi,

I don't understand how this is going to work. According to the changes in the documentation, the user passes a string as the endpoint. Here, self.client (a Python object representing the connection to BigQuery) is being assigned a string value. That does not sound right.

If an API endpoint is being used then, based on the BigQuery client code, you need to pass an instance of the ClientOptions class via the client_options kwarg when creating the connection. ClientOptions is defined in API Core Client Options.
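
For example, a minimal sketch of passing ClientOptions when creating the BigQuery client; the project and endpoint values are placeholders, not anything DVT defines today:

```python
from google.api_core.client_options import ClientOptions
from google.cloud import bigquery

# Placeholder custom endpoint; in practice this would come from the connection config.
client = bigquery.Client(
    project="my-project",
    client_options=ClientOptions(api_endpoint="https://bigquery.p.googleapis.com"),
)
```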

The latest version of Ibis performs an assignment much like the one proposed above. However, the documentation says the value should be a Client object from the google.cloud.bigquery package.

Based on the above analysis, one way to accomplish this might be to have DVT environment variables for the API endpoints of BigQuery, GCS, Secret Manager and Spanner. If those are set, we use their values when connecting to the client APIs. I have started a section in the doc on running DVT on-prem where this explanation may fit. This looks like a more complicated change; I would rather be wrong and we have a simple fix, but I am not sure that is possible here.
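
A rough sketch of that idea, assuming a hypothetical environment variable name that DVT does not currently define:

```python
import os

from google.api_core.client_options import ClientOptions
from google.cloud import bigquery


def client_options_from_env(var_name):
    """Build ClientOptions from an environment variable, or return None to use defaults."""
    endpoint = os.environ.get(var_name)
    return ClientOptions(api_endpoint=endpoint) if endpoint else None


# DVT_BIGQUERY_API_ENDPOINT is a hypothetical variable name used for illustration only.
bq_client = bigquery.Client(
    client_options=client_options_from_env("DVT_BIGQUERY_API_ENDPOINT")
)
```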

I am not aware that we use the BigQuery Storage API. That is only for streaming reads.

We might be able to test this by running DVT from a laptop with an IAP tunnel to a PSC endpoint for googleapis.com in the pso-kokoro-resources project. It may be too complicated to make the test part of the integration suite.

Thanks.

Sundar Mudupalli
