Use vendored calcite 1.40 #35718

Merged
merged 6 commits into from
Jul 29, 2025
Conversation

@Abacn Abacn commented Jul 28, 2025

Fix #35483, fix #26403

@Abacn Abacn changed the title from "Use vendor calcite" to "Use vendored calcite 1.40" Jul 28, 2025
@Abacn Abacn force-pushed the use-vendor-calcite branch from 5aa5472 to 531b5fc Compare July 28, 2025 20:09

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @Abacn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on upgrading the Apache Calcite library used within the project from version 1.28.0 to 1.40.0. This significant update not only involves widespread changes to import statements and build configurations but also introduces support for new SQL features and syntax. The changes ensure compatibility with the latest Calcite version and prepare the groundwork for leveraging its enhanced capabilities.

Highlights

  • Core Dependency Upgrade: The Apache Calcite vendored library has been upgraded from version 1.28.0 to 1.40.0. This involves removing the old calcite-1_28_0 module and integrating the new calcite-1_40_0 module across the build system and source code.
  • API and Import Path Updates: Numerous Java files within sdks/java/extensions/sql have been updated to reflect the new package structure of the vendored Calcite library, changing import paths from org.apache.beam.vendor.calcite.v1_28_0 to org.apache.beam.vendor.calcite.v1_40_0.
  • New SQL Feature Support: The config.fmpp file, used for SQL parser generation, has been updated to include new SQL keywords and parsing methods (e.g., ARRAY_AGG, DATE_DIFF, PIVOT, UNPIVOT), indicating an expansion of supported SQL features.
  • Build System Adjustments: The buildSrc/src/main/groovy/BeamModulePlugin.groovy and settings.gradle.kts files were modified to correctly reference the new Calcite vendored library and adjust build logic related to publishing.
  • Test Output and Query Syntax Changes: Several SQL line tests (BeamSqlLineIT.java, BeamSqlLineTest.java) required updates to expected timestamp formats (adding microseconds). Additionally, Nexmark SQL queries were adjusted to use backticks around dateTime (e.g., `dateTime`), likely due to changes in Calcite's identifier or keyword handling.
  • Internal Calcite Integration Fixes: Specific code adjustments were made in CalciteQueryPlanner.java to address rule application behavior (withExpand(true)) and in FilterUtils.java to handle new literal representations and boolean field parsing, adapting to internal changes in Calcite 1.40.0.
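
The backtick quoting mentioned in the highlights can be illustrated with a small Java sketch. `BacktickQuoting` and `quote` are hypothetical names, not Beam or Calcite APIs, the table name `bid` is only illustrative, and the rule of doubling an embedded backtick is an assumption about the quoting convention in use rather than something this thread confirms.

```java
// Sketch only: BacktickQuoting and quote() are hypothetical helpers, not part of Beam or Calcite.
final class BacktickQuoting {
  private BacktickQuoting() {}

  /** Wraps an identifier in backticks; an embedded backtick is doubled (assumed escape rule). */
  static String quote(String identifier) {
    return "`" + identifier.replace("`", "``") + "`";
  }

  public static void main(String[] args) {
    // Hypothetical query shape; quoting lets the parser read dateTime as a column name.
    String sql = "SELECT " + quote("dateTime") + " FROM bid";
    System.out.println(sql);
  }
}
```

Quoting the identifier means the parser treats it as a plain column reference even when the bare word would otherwise be read as a keyword.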

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This PR upgrades the vendored Calcite dependency from 1.28.0 to 1.40.0. The changes are extensive and touch many files, primarily updating dependency versions, import paths, and adapting to API changes in the new Calcite version. The changes look mostly correct and necessary for the upgrade. I have a few suggestions for minor improvements.

Comment on lines +1839 to 1841
if (isRelease(project)) {
throw new GradleException("Release enabled but " + msg + ".")
} else {

medium

This hunk introduces a logic change that seems unrelated to the Calcite version upgrade. The condition for throwing a GradleException when a required Java version is missing has been changed. While this change might be desirable, including it in a large dependency upgrade PR makes it harder to review and understand its impact. It would be better to extract this logic change into a separate, focused pull request. This will improve clarity and make the history easier to follow.

Comment on lines 166 to +175
line ->
line.replace('import org.apache.calcite.', 'import org.apache.beam.vendor.calcite.v1_28_0.org.apache.calcite.')
line.replace('import org.apache.calcite.', 'import org.apache.beam.vendor.calcite.v1_40_0.org.apache.calcite.')
}
filter{
line ->
line.replace('import static org.apache.calcite.', 'import static org.apache.beam.vendor.calcite.v1_28_0.org.apache.calcite.')
line.replace('import static org.apache.calcite.', 'import static org.apache.beam.vendor.calcite.v1_40_0.org.apache.calcite.')
}
filter{
line ->
line.replace('import com.google.common.', 'import org.apache.beam.vendor.calcite.v1_40_0.com.google.common.')

medium

For better readability and a minor performance improvement, these three separate filter closures can be combined into a single one.

  filter { line ->
    line.replace('import org.apache.calcite.', 'import org.apache.beam.vendor.calcite.v1_40_0.org.apache.calcite.')
        .replace('import static org.apache.calcite.', 'import static org.apache.beam.vendor.calcite.v1_40_0.org.apache.calcite.')
        .replace('import com.google.common.', 'import org.apache.beam.vendor.calcite.v1_40_0.com.google.common.')
  }
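
Why chaining the three replacements is safe can be checked with a plain Java sketch of the same string rewriting. The package prefixes are taken from the hunk above; `ImportRelocator` and its methods are hypothetical names used only for illustration and are not part of the Beam build.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: ImportRelocator is a hypothetical helper mirroring the Gradle filter logic.
final class ImportRelocator {
  private ImportRelocator() {}

  // Vendored package prefix as it appears in the review thread.
  static final String VENDOR = "org.apache.beam.vendor.calcite.v1_40_0.";

  /** Applies the three import rewrites in one pass over a single source line. */
  static String relocate(String line) {
    return line
        // "import static org..." does not contain "import org...", so this rule skips static imports.
        .replace("import org.apache.calcite.", "import " + VENDOR + "org.apache.calcite.")
        .replace("import static org.apache.calcite.", "import static " + VENDOR + "org.apache.calcite.")
        .replace("import com.google.common.", "import " + VENDOR + "com.google.common.");
  }

  /** Rewrites every line of a source file. */
  static List<String> relocateAll(List<String> lines) {
    return lines.stream().map(ImportRelocator::relocate).collect(Collectors.toList());
  }
}
```

Because a rewritten line starts with `import org.apache.beam.vendor...`, none of the three patterns matches it again, so applying the chain twice leaves the line unchanged.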

Comment on lines +338 to +340
+ @SuppressWarnings("override.return")
  @Override
- public String getSchema() throws SQLException {
+ public @Nullable String getSchema() throws SQLException {

medium

Adding @Nullable annotation to this method is a good defensive coding practice, as it explicitly documents that this method can return null. However, it is important to ensure that all call sites are prepared to handle a null return value.
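
One way a call site might guard against the null return is sketched below. `SchemaSource` is a hypothetical stand-in for the single method under discussion (the checked `SQLException` is omitted to keep the sketch short), and `schemaOrDefault` is an illustrative helper, not code from this PR.

```java
import java.util.Optional;

// Sketch only: these types are hypothetical and do not exist in Beam.
final class NullableSchemaExample {
  private NullableSchemaExample() {}

  /** Stand-in for a connection whose getSchema() may return null. */
  interface SchemaSource {
    String getSchema();
  }

  /** Returns the schema name, or the fallback when the source reports none. */
  static String schemaOrDefault(SchemaSource source, String fallback) {
    return Optional.ofNullable(source.getSchema()).orElse(fallback);
  }
}
```

Wrapping the result in `Optional` keeps the null check in one place instead of spreading it across callers.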

Comment on lines +257 to 260
List<Table.Cell<RelNode, ?, Object>> costKeys =
bmq.map.cellSet().stream()
.filter(entry -> entry.getValue() instanceof BeamCostModel)
.filter(entry -> ((BeamCostModel) entry.getValue()).isInfinite())

medium

The type parameter for costKeys is too restrictive. It should be ? instead of List.

Suggested change
List<Table.Cell<RelNode, ?, Object>> costKeys =
bmq.map.cellSet().stream()
.filter(entry -> entry.getValue() instanceof BeamCostModel)
.filter(entry -> ((BeamCostModel) entry.getValue()).isInfinite())
List<Table.Cell<RelNode, ?, Object>> costKeys =
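
For readers unfamiliar with the pattern in this hunk, the following Java sketch shows the same "filter by runtime type, then cast" stream idiom on a plain map. `Cost` is a hypothetical stand-in for `BeamCostModel`, and a `Map<String, Object>` replaces the Guava `Table` so the example needs only the standard library.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: InfiniteCostFilter and Cost are hypothetical and not part of Beam.
final class InfiniteCostFilter {
  private InfiniteCostFilter() {}

  /** Minimal stand-in for a cost value; only the isInfinite() check is modeled. */
  static final class Cost {
    final double rows;

    Cost(double rows) {
      this.rows = rows;
    }

    boolean isInfinite() {
      return Double.isInfinite(rows);
    }
  }

  /** Returns the sorted keys whose cached value is a Cost with an infinite estimate. */
  static List<String> infiniteKeys(Map<String, Object> cache) {
    return cache.entrySet().stream()
        // The instanceof filter must run first so the cast in the next step cannot fail.
        .filter(e -> e.getValue() instanceof Cost)
        .filter(e -> ((Cost) e.getValue()).isInfinite())
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }
}
```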

Comment on lines +145 to +146
// Revert the flag flip of CALCITE-3870 which led to missing rules
SqlToRelConverter.Config sqlToRelConfig = SqlToRelConverter.config().withExpand(true);

medium

This change reverts a flag flip in CALCITE-3870. It would be helpful to add a comment explaining why this reversion is necessary for Beam SQL.

@@ -141,6 +142,9 @@
final SqlOperatorTable opTab0 =
connection.config().fun(SqlOperatorTable.class, SqlStdOperatorTable.instance());

// Revert the flag flip of CALCITE-3870 which led to missing rules
SqlToRelConverter.Config sqlToRelConfig = SqlToRelConverter.config().withExpand(true);

medium

Consider adding a comment to explain why withExpand(true) is needed here.

Suggested change
SqlToRelConverter.Config sqlToRelConfig = SqlToRelConverter.config().withExpand(true);
// Revert the flag flip of CALCITE-3870 which led to missing rules
SqlToRelConverter.Config sqlToRelConfig = SqlToRelConverter.config().withExpand(true); // Needed for...

@@ -311,7 +312,7 @@ public void testArithmeticOperator() {
.addExpr("c_double + c_bigint", 2.0)
.addExpr("1 - 1", 0)
.addExpr("1.0 - 1", ZERO_1)

medium

The result of 1 - 0.0 should be 1.0, not 0.1.

Suggested change
.addExpr("1.0 - 1", ZERO_1)
.addExpr("1 - 0.0", ONE_1)
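
The arithmetic in this comment can be checked directly with `java.math.BigDecimal`. This is a minimal sketch that assumes `ZERO_1` and `ONE_1` denote zero and one at scale 1, which the thread does not state; `DecimalSubtractionCheck` is a hypothetical name.

```java
import java.math.BigDecimal;

// Sketch only: models the two decimal expressions with BigDecimal, not with Beam SQL.
final class DecimalSubtractionCheck {
  private DecimalSubtractionCheck() {}

  /** Subtracts two decimal literals; the result takes the larger scale of the operands. */
  static BigDecimal subtract(String left, String right) {
    return new BigDecimal(left).subtract(new BigDecimal(right));
  }

  public static void main(String[] args) {
    System.out.println(subtract("1.0", "1")); // prints 0.0
    System.out.println(subtract("1", "0.0")); // prints 1.0
  }
}
```

Both results carry one fractional digit because `BigDecimal.subtract` uses the larger scale of its two operands.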

@@ -708,8 +709,8 @@

@Test
@SqlOperatorTests({
@SqlOperatorTest(name = "CHARACTER_LENGTH", kind = "OTHER_FUNCTION"),
@SqlOperatorTest(name = "CHAR_LENGTH", kind = "OTHER_FUNCTION"),
@SqlOperatorTest(name = "CHARACTER_LENGTH", kind = "CHAR_LENGTH"),

medium

The kind for CHARACTER_LENGTH and CHAR_LENGTH should be CHAR_LENGTH instead of OTHER_FUNCTION.

Suggested change
@SqlOperatorTest(name = "CHARACTER_LENGTH", kind = "CHAR_LENGTH"),
@SqlOperatorTest(name = "CHARACTER_LENGTH", kind = "CHAR_LENGTH"),

@@ -1129,8 +1130,7 @@
  }

  @Test
- // https://github.com/apache/beam/issues/19001
- // @SqlOperatorTest(name = "FLOOR", kind = "FLOOR")
+ @SqlOperatorTest(name = "FLOOR", kind = "FLOOR")

medium

This test was commented out. It should be uncommented.

Suggested change
@SqlOperatorTest(name = "FLOOR", kind = "FLOOR")
@SqlOperatorTest(name = "FLOOR", kind = "FLOOR")

Comment on lines +1143 to +1144
.addExpr("FLOOR(c_double)", 1.0)
.addExpr("FLOOR(-c_double)", -2.0);

medium

Add a test case for FLOOR(-c_double).

Suggested change
.addExpr("FLOOR(c_double)", 1.0)
.addExpr("FLOOR(-c_double)", -2.0);
.addExpr("FLOOR(c_double)", 1.0)
.addExpr("FLOOR(-c_double)", -2.0);
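
The expected values in these two cases follow from FLOOR rounding toward negative infinity rather than toward zero. A minimal Java sketch, assuming `c_double` holds a value such as 1.1 (the actual test fixture value is not shown in this thread); `FloorCheck` is a hypothetical name.

```java
// Sketch only: uses Math.floor to mirror the expected FLOOR results, not Beam SQL itself.
final class FloorCheck {
  private FloorCheck() {}

  /** Returns the largest integral value less than or equal to v, as a double. */
  static double floor(double v) {
    return Math.floor(v);
  }

  public static void main(String[] args) {
    double cDouble = 1.1; // assumed fixture value
    System.out.println(floor(cDouble));  // prints 1.0
    System.out.println(floor(-cDouble)); // prints -2.0
  }
}
```

A truncating implementation would return -1.0 for the negative input, which is why the negated case is a useful regression check.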

@Abacn Abacn marked this pull request as ready for review July 29, 2025 01:15

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

Assigning reviewers:

R: @ahmedabu98 for label java.
R: @damccorm for label build.

Note: If you would like to opt out of this review, comment "assign to next reviewer".

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@damccorm damccorm left a comment

Thanks! Had a non-blocking question (feel free to merge)

@damccorm damccorm left a comment

Thanks!

@Abacn Abacn merged commit 805b68c into apache:master Jul 29, 2025
33 checks passed
@Abacn Abacn deleted the use-vendor-calcite branch July 29, 2025 17:38
Successfully merging this pull request may close these issues.

  • [Task]: Upgrade calcite to a more recent version
  • Update vendored calcite to eliminate vulnerability from shaded log4j:1.2.17 and protobuf-java:3.19.2
3 participants