Remove comments from query when parsing SET #4959

jdlawrie · 2025-05-21T12:36:16Z

No description provided.

Signed-off-by: James Lawrie <[email protected]>

gemini-code-assist

Hello @jdlawrie, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello! Gemini here, providing a summary of this pull request. This PR, authored by jdlawrie, focuses on improving the robustness of the SetParser by ensuring that single-line SQL comments (--) are ignored when parsing the input query string. Previously, comments within the query might have interfered with the parsing logic. This change adds a step to explicitly remove these comments before the main parsing process begins, making the parser more resilient to common SQL syntax variations.

Highlights

- Comment Handling: The primary change introduces logic to strip single-line SQL comments (everything from `--` to the end of the line) from the input query string within the `SetParser::parse1` method.
- Parser Robustness: With comments removed before the main regex runs, the parser is less likely to fail or behave unexpectedly on queries that include them.

Changelog

lib/set_parser.cpp
- Added regular expression replacement to remove single-line SQL comments (--.*$) from the query string before parsing (lines 88-89).

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Comments hide the truth,
Parser sees only code,
Clean string now it reads.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a change to the SetParser::parse1() method to remove -- style SQL comments from the query string before parsing. This is a good step towards making the SET statement parser more robust.

However, there are a few key areas to consider:

- Consistency with `parse1v2()`: The `parse1v2()` method, which also parses SET statements, does not include this new comment removal logic. This could lead to inconsistencies.
- Handling of Other Comment Types: The current change only addresses `--` comments. If other comment types (like `#` or `/* ... */`) can also interfere with SET parsing in `parse1()`, this fix might be partial.
- Testing: Adding specific test cases for SET statements with `--` comments would strengthen the change.
- PR Description: The pull request description is currently empty. Adding a brief explanation of the problem and the solution would be helpful for reviewers and future maintainers.

Overall, the core change for -- comments in parse1() is implemented correctly with the given regex.

Summary of Findings

- Inconsistent Comment Handling with `parse1v2`: The `parse1()` method now removes `--` comments, but `parse1v2()`, which also parses SET statements, lacks this logic. This could lead to differing behaviors. (Severity: High)
- Handling of Other Comment Types in `parse1`: The PR addresses `--` comments. If `#` or `/* ... */` comments can also disrupt SET parsing in `parse1()`, the current fix might be partial. (Severity: Medium)
- Need for Specific Test Cases: Adding test cases to verify the removal of `--` comments in various SET statement scenarios would improve confidence in this change. (Severity: Medium)
- Comment Clarity: The comment `// remove comments` for `re2` on line 88 is slightly imprecise, as the code only removes `--` style comments. A more specific comment like `// remove -- style SQL comments` would be clearer. (Severity: Low - Not commented due to review settings)
- Pull Request Description: The pull request description is empty. Providing context about the problem and solution enhances reviewability and maintainability. (Severity: Low - Not commented due to review settings)

Merge Readiness

The pull request addresses an important aspect of query parsing by handling -- comments in SET statements. However, before merging, I recommend addressing the high-severity issue regarding the inconsistency with parse1v2(). Additionally, considering the handling of other comment types and adding specific test cases would further improve the robustness and reliability of this functionality. As a reviewer, I am not authorized to approve pull requests, but these changes are suggested to enhance code quality. Please ensure other reviewers approve this PR before merging.

gemini-code-assist · 2025-05-21T12:38:24Z

lib/set_parser.cpp

+	re2::RE2 re2("--.*$", *opt2); // remove comments
+	re2::RE2::Replace(&query, re2, "");


This change correctly adds removal of -- comments for the parse1() method using the regex "--.*$".

However, the parse1v2() method (defined later in this file, e.g., lines 378-381 show its own SET prefix and suffix trimming) also parses SET statements but does not include similar logic to remove -- comments.

Could this lead to inconsistent behavior or parsing failures if parse1v2() encounters SET queries that include -- comments? If parse1v2() is intended to have feature parity with parse1() regarding comment handling for SET statements, it might require a similar update to ensure consistent parsing behavior.
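
One way to keep the two code paths in step is to route both through a single pre-processing helper. The sketch below is illustrative only: `strip_dash_comments`, `prepare_for_parse1`, and `prepare_for_parse1v2` are hypothetical names, not functions from `lib/set_parser.cpp`, and `std::regex` is used as a stand-in for RE2 so the example compiles without extra libraries. The pattern `--[^\n]*` is also a substitution for the PR's `--.*$`; it avoids depending on how `$` is configured. One detail worth checking in the real code: `RE2::Replace` rewrites only the first match, whereas `RE2::GlobalReplace` rewrites every match, so a query carrying several `--` comments may need the latter.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical shared helper: drops everything from "--" to the end of each
// line. std::regex is only a stand-in for RE2 in this sketch.
// Caveat: a "--" inside a quoted string would be removed as well.
std::string strip_dash_comments(const std::string& query) {
    static const std::regex dash_comment("--[^\\n]*");
    return std::regex_replace(query, dash_comment, "");
}

// Stand-ins for the start of the two parsing entry points. The real methods
// do far more; the point is only that both call the same helper.
std::string prepare_for_parse1(const std::string& query) {
    return strip_dash_comments(query);
}

std::string prepare_for_parse1v2(const std::string& query) {
    return strip_dash_comments(query);
}

// Small self-check showing that both paths see the same cleaned query.
void run_examples() {
    const std::string q = "SET a = 1 -- first\n, b = 2 -- second";
    assert(prepare_for_parse1(q) == "SET a = 1 \n, b = 2 ");
    assert(prepare_for_parse1(q) == prepare_for_parse1v2(q));
}
```

With a shared helper, a later change to comment handling (for example, adding other comment styles) lands in both parsers at once.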

gemini-code-assist · 2025-05-21T12:38:24Z

lib/set_parser.cpp

+	re2::RE2 re2("--.*$", *opt2); // remove comments
+	re2::RE2::Replace(&query, re2, "");


This regex "--.*$" effectively handles -- comments by removing them up to the end of the line.

It's worth considering if other SQL comment types, such as # (hash) comments or /* ... */ block comments, can also appear in SET statements in a way that might interfere with the main parsing logic (the complex regex at line 93).

If these other comment types can also cause parsing issues within SET statements, would a more comprehensive comment removal strategy be beneficial for parse1()? This might involve leveraging or adapting the existing SetParser::remove_comments method (lines 572-612), which handles multiple comment types, or adding specific regexes for other comment styles to ensure robustness against various comment formats in SET queries.
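
As a rough illustration of what a more comprehensive strategy could look like, the sketch below scans the query once and drops `#`, `--`, and `/* ... */` comments while copying quoted text untouched. It is not the existing `SetParser::remove_comments` method, whose behavior is not shown in this thread; `strip_sql_comments` is a hypothetical name. Two assumptions are baked in: a `--` counts as a comment only when followed by whitespace or the end of input (MySQL's documented rule, which is stricter than the PR's regex), and MySQL's `/*! ... */` executable comments are not special-cased.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical single-pass comment stripper (sketch, not the project's code).
std::string strip_sql_comments(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = in[i];
        // Copy quoted text verbatim so comment markers inside it survive.
        if (c == '\'' || c == '"' || c == '`') {
            const char quote = c;
            out += in[i++];
            while (i < n) {
                if (in[i] == '\\' && i + 1 < n) {  // keep backslash escapes
                    out += in[i++];
                    out += in[i++];
                    continue;
                }
                out += in[i];
                if (in[i++] == quote) break;
            }
            continue;
        }
        // "--" starts a comment only if followed by whitespace or end of input.
        const bool dash_comment =
            c == '-' && i + 1 < n && in[i + 1] == '-' &&
            (i + 2 == n || std::isspace(static_cast<unsigned char>(in[i + 2])));
        if (c == '#' || dash_comment) {
            while (i < n && in[i] != '\n') ++i;    // the newline itself is kept
            continue;
        }
        if (c == '/' && i + 1 < n && in[i + 1] == '*') {
            const std::size_t end = in.find("*/", i + 2);
            i = (end == std::string::npos) ? n : end + 2;
            out += ' ';                            // keep neighbouring tokens apart
            continue;
        }
        out += c;
        ++i;
    }
    return out;
}

// Small self-check covering the three comment styles and a quoted "--".
void run_examples() {
    assert(strip_sql_comments("SET a = 1 -- c") == "SET a = 1 ");
    assert(strip_sql_comments("SET a = 'x--y' # c") == "SET a = 'x--y' ");
    assert(strip_sql_comments("SET/*c*/a = 1") == "SET a = 1");
}
```

The quote handling is the main difference from a plain regex: a value such as `'x--y'` passes through intact instead of being cut at the dashes.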

renecannao · 2025-05-21T12:41:06Z

Can one of the admins verify this patch?

Remove comments from query when parsing SET

a069b3f

Signed-off-by: James Lawrie <[email protected]>

gemini-code-assist bot reviewed May 21, 2025

jdlawrie mentioned this pull request May 21, 2025

Unable to parse unknown SET query Setting lock_hostgroup. Please report a bug for future enhancements:SET @@innodb_lock_wait_timeout = 1 #4956

Open

gemini-code-assist bot suggested changes May 21, 2025

It's probably a better idea to remove the spaces after removing the c…

77d3c42

…omments, not before Signed-off-by: James Lawrie <[email protected]>
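
The commit message above is truncated, but it appears to concern the order of whitespace trimming and comment removal. A minimal sketch of why that order matters, using hypothetical helpers (`rtrim` and `strip_dash_comment` are not names from the PR, and the stripping here simply cuts at the first `--` of a single-line query):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: remove trailing whitespace.
std::string rtrim(std::string s) {
    const std::size_t last = s.find_last_not_of(" \t\r\n");
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}

// Hypothetical helper: cut the string at the first "--" (single-line queries).
std::string strip_dash_comment(std::string s) {
    const std::size_t pos = s.find("--");
    if (pos != std::string::npos) s.erase(pos);
    return s;
}

// Self-check contrasting the two orders on the same input.
void run_examples() {
    const std::string q = "SET x = 1   -- note";
    // Trim first: nothing to trim yet, so the spaces before "--" remain.
    assert(strip_dash_comment(rtrim(q)) == "SET x = 1   ");
    // Strip first: the spaces that preceded the comment are now trailing
    // and get trimmed.
    assert(rtrim(strip_dash_comment(q)) == "SET x = 1");
}
```

Trimming after the comment is removed leaves no trailing whitespace for a later pattern to trip over.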
