Skip to content

V3.0 PostgreSQL Prepared statements - draft 2 #4953

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 34 commits into
base: v3.0
Choose a base branch
from

Conversation

renecannao
Copy link
Contributor

2nd draft

Merge of v3.0 into #4836

Also adding new debug modules
process_mysql_query() extract queries at different offsets depending if:
* MySQL (COM_QUERY, COM_STMT_PREPARE, COM_STMT_EXECUTE)
* PostgreSQL (Simple query)
* PostgreSQL (PARSE)
Not working, just placeholders
PgSQL_Connection.cpp now includes extra header to access PostgreSQL internals.
To achieve that some postgres headers were modified, and ProxySQL code
needs to define PG_SYM_PROXYSQL (see PgSQL_Connection.cpp as example).
Use of PQsendQueryPrepared() instead of PQsendQueryParams() .
PQsendQueryParams() sends also a Parse packet , but we had the query already parsed .

Fixed how resultFormat is evaluated.

Hardcoded stmtName `proxysql_ps_1` : this is temporary
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @renecannao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request represents the second draft of the implementation for PostgreSQL prepared statements support in ProxySQL, as part of the larger v3.0 development effort. The core changes involve introducing base classes for prepared statement handling and then specializing them for the PostgreSQL protocol. This includes managing prepared statement metadata globally and locally per connection, parsing PostgreSQL's prepared statement messages (Parse, Bind, Describe, Execute), and integrating this logic into the PostgreSQL session and connection handling. New debug modules and test cases are also added to support and validate the new functionality.

Highlights

  • PostgreSQL Prepared Statements: Introduces foundational support for PostgreSQL prepared statements, allowing clients to use the Parse, Bind, Describe, and Execute protocol messages through ProxySQL.
  • Base Classes for Prepared Statements: Refactors prepared statement logic into generic base classes (Base_STMT_Global_info, Base_STMTs_local_v14, Base_STMT_Manager_v14) to be shared between different database protocols (currently MySQL and PostgreSQL).
  • PostgreSQL Protocol Parsing: Adds new classes (PgBindPacket, PgDescribePacket, PgExecutePacket, PgParsePacket) and parsing functions to handle the specific packet formats used by the PostgreSQL frontend/backend protocol for prepared statements.
  • Session and Connection Integration: Updates the PostgreSQL session and connection handling logic to recognize and process the new prepared statement messages, managing the state transitions (PGSQL_PROCESSING_PARSE, PGSQL_PROCESSING_BIND, etc.) and interacting with the new prepared statement management structures.
  • Testing: Adds several new test cases (PG_PS01-t.cpp through PG_PS06.2-t.cpp) specifically designed to validate the new PostgreSQL prepared statement functionality using the libpq library, including basic prepare/execute, multiple connections, and asynchronous handling.

Changelog

Click here to see the changelog
  • deps/Makefile
    • Added patches for PostgreSQL's c.h and port.h includes to the build process (lines 309-310).
    • Added new object files for base and PostgreSQL prepared statement implementations to the build (Base_PreparedStatement.oo, MySQL_PreparedStatement.oo, PgSQL_PreparedStatement.oo) (line 155).
  • deps/postgresql/c.h.patch
    • Added #ifndef PG_SYM_PROXYSQL guards around pg_attribute_aligned and uint128 definitions in PostgreSQL's c.h to prevent potential conflicts (lines 7-9, 18-20).
  • deps/postgresql/port.h.patch
    • Added #ifndef PG_SYM_PROXYSQL guards around portable delay handling and case-independent comparison functions in PostgreSQL's port.h (lines 7-15).
  • include/Base_PreparedStatement.h
    • Added base class Base_STMT_Global_info for global prepared statement metadata (lines 52-85).
    • Added templated base class Base_STMTs_local_v14 for connection-local prepared statement mapping (lines 198-256).
    • Added templated base class Base_STMT_Manager_v14 for global prepared statement management (lines 259-295).
  • include/MySQL_PreparedStatement.h
    • Included Base_PreparedStatement.h (line 4).
    • Made MySQL_STMT_Global_info inherit from Base_STMT_Global_info and removed duplicated members (line 44).
    • Added is_mysql flag to stmt_execute_metadata_t (line 61).
    • Made MySQL_STMTs_local_v14 inherit from Base_STMTs_local_v14 and removed duplicated members (line 165).
    • Made MySQL_STMT_Manager_v14 inherit from Base_STMT_Manager_v14 and removed duplicated members (line 191).
  • include/MySQL_Protocol.h
    • Commented out the include for MySQL_Prepared_Stmt_info.h (line 7).
    • Commented out the current_PreStmt member (line 136).
  • include/PgSQL_Connection.h
    • Added declarations for prepared statement handler functions (stmt_execute_start, stmt_execute_cont, stmt_prepare_start, stmt_prepare_cont) (lines 291-294).
    • Added interr and local_stmts members (lines 504, 506).
  • include/PgSQL_Data_Stream.h
    • Renamed PgSQL_MyDS_real_query to PgDS_real_query (line 30).
    • Modified PgDS_real_query constructor to handle 'P' (Parse) packets and extract the query string correctly (lines 45-57).
    • Updated the member name from mysql_real_query to mysql_real_query with a FIXME comment (line 103).
  • include/PgSQL_Logger.hpp
    • Removed unused client_stmt_id member (line 35).
  • include/PgSQL_PreparedStatement.h
    • Included Base_PreparedStatement.h (line 4).
    • Defined PROXYSQL_PS_PREFIX (line 9).
    • Added PgSQL_STMT_Global_info class inheriting from Base_STMT_Global_info (lines 11-16).
    • Added PgSQL_STMTs_local_v14 class inheriting from Base_STMTs_local_v14 and added PostgreSQL-specific maps (stmt_name_to_id, stmt_id_to_name) (lines 20-50).
    • Added PgSQL_STMT_Manager_v14 class inheriting from Base_STMT_Manager_v14 (lines 53-59).
  • include/PgSQL_Protocol.h
    • Added new classes for parsing PostgreSQL prepared statement packets: PgBindPacket, PgDescribePacket, PgExecutePacket, PgParsePacket (lines 80-146).
    • Added generate_ParseComplete function (line 837).
    • Commented out current_PreStmt member (line 668).
    • Added static parse functions for Describe and Execute packets (lines 986-987).
  • include/PgSQL_Session.h
    • Added forward declarations for PgBindPacket and PgParsePacket (lines 16-17).
    • Added stmt_name, BindPacket, and ParsePacket members to PgSQL_Query_Info (lines 156, 161-162).
    • Removed commented-out handler declarations (lines 275-280, 284-288, 292-294).
    • Added declarations for new PostgreSQL prepared statement handlers (handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_PARSE, handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_BIND) (lines 297-298).
    • Removed handler_rc0_Process_GTID declaration (line 311).
    • Removed commented-out members (lines 374-391, 457-458).
  • include/proxysql_debug.h
    • Added PROXY_TRACE3 macro for debug logging (lines 39-40).
  • include/proxysql_structs.h
    • Added new debug module enums for PostgreSQL components (lines 159-166).
    • Added new session status enums for PostgreSQL prepared statement processing (PGSQL_PROCESSING_PARSE, PGSQL_PROCESSING_BIND, PGSQL_PROCESSING_DESCRIBE, PGSQL_PROCESSING_EXECUTE) (lines 313-316).
    • Added forward declarations for PostgreSQL STMT classes (lines 825-826).
  • include/proxysql_utils.h
    • Added inline utility functions read_big_endian_int16 and read_big_endian_int32 for reading big-endian integers from buffers (lines 261-274).
    • Added include guard at the end of the file (line 336).
  • include/query_processor.h
    • Commented out several templated or MySQL-specific function declarations related to query parsing and digest updates (lines 273-297).
  • lib/Base_PreparedStatement.cpp
    • Explicitly instantiated templates for MySQL and PostgreSQL STMT types (lines 13-25).
    • Implemented static_stmt_compute_hash and Base_STMT_Global_info::stmt_compute_hash (lines 53-99).
    • Implemented Base_STMT_Global_info::compute_hash (lines 101-104).
    • Commented out MySQL-specific prepared statement info and long data handling (lines 106-186).
    • Implemented Base_STMT_Global_info constructor (lines 187-188).
    • Commented out MySQL-specific constructor logic (lines 189-346).
    • Commented out MySQL_STMT_Global_info::calculate_mem_usage and update_metadata (lines 349-566).
    • Implemented Base_STMT_Global_info destructor (lines 568-574).
    • Commented out MySQL-specific destructor logic (lines 575-621).
    • Commented out MySQL-specific digest_text free (lines 622-625).
    • Commented out MySQL_STMTs_local_v14::backend_insert (lines 628-640).
    • Implemented templated Base_STMTs_local_v14::compute_hash (lines 642-649).
    • Implemented templated Base_STMT_Manager_v14 constructor (lines 651-655).
    • Commented out MySQL-specific manager constructor logic (lines 657-671).
    • Implemented templated Base_STMT_Manager_v14 destructor (lines 674-680).
    • Implemented templated Base_STMT_Manager_v14::ref_count_client (lines 682-757).
    • Implemented templated Base_STMT_Manager_v14::ref_count_server (lines 759-779).
    • Commented out MySQL_STMTs_local_v14 destructor (lines 781-802).
    • Implemented templated Base_STMT_Manager_v14::find_prepared_statement_by_hash (lines 804-813).
    • Implemented templated Base_STMT_Manager_v14::find_prepared_statement_by_stmt_id (lines 815-831).
    • Implemented templated Base_STMTs_local_v14::generate_new_client_stmt_id (lines 835-852).
    • Implemented templated Base_STMTs_local_v14::find_global_stmt_id_from_client (lines 854-862).
    • Implemented templated Base_STMTs_local_v14::client_close (lines 864-886).
    • Commented out MySQL_STMT_Manager_v14::add_prepared_statement (lines 888-942).
    • Commented out MySQL_STMT_Manager_v14::get_memory_usage (lines 945-969).
    • Implemented templated Base_STMT_Manager_v14::get_metrics (lines 972-1019).
    • Added PS_global_stats helper class for dumping stats (lines 1022-1092).
    • Implemented templated Base_STMT_Manager_v14::get_prepared_statements_global_infos (lines 1095-1123).
  • lib/Base_Session.cpp
    • Modified init to allocate MySQL_STMTs_meta for PgSQL_Session as well (line 88).
  • lib/Makefile
    • Added Base_PreparedStatement.oo, MySQL_PreparedStatement.oo, and PgSQL_PreparedStatement.oo to the list of C++ object files (line 155).
  • lib/MySQL_PreparedStatement.cpp
    • Removed duplicated stmt_compute_hash and compute_hash functions (lines 21-64).
    • Removed duplicated constructor logic (lines 141-149, 158-161).
    • Removed duplicated destructor logic (lines 521-532, 570-576).
    • Removed duplicated MySQL_STMT_Manager_v14 constructor/destructor and related methods (lines 599-720, 744-816, 899-1049).
  • lib/MySQL_Protocol.cpp
    • Commented out the initialization of current_PreStmt (line 126).
  • lib/PgSQL_Connection.cpp
    • Added #define PG_SYM_PROXYSQL (line 17).
    • Changed debug module from PROXY_DEBUG_MYSQL_PROTOCOL to PROXY_DEBUG_PGSQL_PROTOCOL in handler (line 281).
    • Added bytes sent updates for query execution (lines 376-378).
    • Commented out MySQL-specific STMT handling in handler (lines 402-602).
    • Added PROXY_TRACE3 calls in fetch_result_start and fetch_result_cont (lines 1190, 1197).
    • Added PROXY_TRACE3 call in flush (line 1254).
    • Added logic in async_query to set async_state_machine based on session status (PGSQL_PROCESSING_PARSE, PGSQL_PROCESSING_BIND, etc.) (lines 1449-1452).
    • Commented out MySQL-specific STMT handling in async_query (lines 1454-1507).
    • Changed debug module from PROXY_DEBUG_MYSQL_PROTOCOL to PROXY_DEBUG_PGSQL_PROTOCOL in reset_session_cont (line 1763).
  • lib/PgSQL_Logger.cpp
    • Commented out accessing stmt_info->digest for logging when status is PROCESSING_STMT_EXECUTE (line 717).
    • Commented out accessing stmt_info->query and stmt_info->query_length for logging when status is PROCESSING_STMT_EXECUTE (lines 730-735).
  • lib/PgSQL_PreparedStatement.cpp
    • Included PgSQL_Protocol.h (line 10).
    • Added destructor for PgSQL_STMTs_local_v14 (lines 17-38).
    • Added destructor for PgSQL_STMT_Global_info (lines 40-55).
    • Added constructor/destructor for PgSQL_STMT_Manager_v14 (lines 57-61).
    • Implemented generate_new_backend_id for PgSQL_STMTs_local_v14 (lines 64-78).
  • lib/PgSQL_Protocol.cpp
    • Added Doxygen comments for write_generic (lines 100-137).
    • Implemented generate_ParseComplete function (lines 1557-1579).
    • Implemented PgBindPacket::parseBindPacket (lines 2478-2590).
    • Implemented PgSQL_Protocol::parseDescribePacket (lines 2592-2638).
    • Implemented PgSQL_Protocol::parseExecutePacket (lines 2641-2688).
    • Implemented PgParsePacket::parseParsePacket (lines 2691-2772).
  • lib/PgSQL_Session.cpp
    • Included PgSQL_PreparedStatement.h (line 18).
    • Changed GloMyStmt to GloPgStmt (line 118).
    • Added destructor logic for BindPacket and ParsePacket in PgSQL_Query_Info (lines 327-334).
    • Modified PgSQL_Query_Info::init to handle 'P' (Parse) packets and extract statement name and query (lines 389-394).
    • Commented out MySQL-specific STMT metadata handling in PgSQL_Query_Info::end (lines 373-384).
    • Removed commented-out handler declarations (lines 282-289, 291-294, 309-310, 349-351, 366-393).
    • Added declarations for new PostgreSQL prepared statement handlers (handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_PARSE, handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_BIND) (lines 297-298).
    • Removed handler_rc0_Process_GTID declaration (line 311).
    • Changed debug module from PROXY_DEBUG_MYSQL_CONNECTION to PROXY_DEBUG_PGSQL_CONNECTION in handler_again___verify_backend_user_db (lines 1442-1443).
    • Changed debug module from PROXY_DEBUG_MYSQL_PROTOCOL to PROXY_DEBUG_PGSQL_PROTOCOL in reset_session_cont (line 1763).
    • Added implementations for handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_PARSE and handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_BIND (lines 2097-2245, 2254-2400).
    • Changed debug module from PROXY_DEBUG_MYSQL_COM to PROXY_DEBUG_PGSQL_COM in various handlers (lines 2603, 3075, 4513, 4536, 4572, 4599, 4805, 4907, 4978, 4983, 5011, 5017, 5030, 5063, 5122, 5125, 5136, 5549, 5557, 5637).
    • Changed debug module from PROXY_DEBUG_MYSQL_QUERY_PROCESSOR to PROXY_DEBUG_PGSQL_QUERY_PROCESSOR in various handlers (lines 4763, 4819, 6240, 6372).
    • Changed debug module from PROXY_DEBUG_MYSQL_CONNECTION to PROXY_DEBUG_PGSQL_CONNECTION in various handlers (lines 2622, 3668, 4111, 4178, 4195, 4205, 4268, 4292, 4299, 4308, 4414, 4422, 4434, 4479, 4483, 5594, 5775, 5780, 5785).
    • Added debug print for received packet type in get_pkts_from_client (lines 2660-2670).
    • Added state transitions for PGSQL_PROCESSING_PARSE, PGSQL_PROCESSING_BIND, PGSQL_PROCESSING_DESCRIBE, PGSQL_PROCESSING_EXECUTE in get_pkts_from_client (lines 2688-2748).
    • Commented out MySQL-specific STMT handling in handler (lines 3872-3893).
    • Commented out MySQL-specific STMT error handling in handler_minus1_GenerateErrorMessage (lines 3550-3592).
    • Removed uuid and trxid from handler___client_DSS_QUERY_SENT___server_DSS_NOT_INITIALIZED (lines 5688-5689).
    • Removed conditional logic for trxid in get_MyConn_from_pool call (lines 5719-5722).
  • lib/PgSQL_Thread.cpp
    • Commented out accessing stmt_info->query_length and stmt_info->query in SQL3_Processlist (lines 4739-4747).
  • lib/ProxySQL_Admin_Tests.cpp
    • Commented out MySQL-specific update_query_digest calls in ProxySQL_Test___GenerateRandomQueryInDigestTable (lines 140-146).
  • lib/Query_Processor.cpp
    • Included MySQL_PreparedStatement.h and PgSQL_PreparedStatement.h (lines 17-18).
    • Commented out large blocks of templated code related to query processing and digest updates (lines 1290-1352, 1929-1987, 1990-1993).
  • lib/debug.cpp
    • Added names for the new PostgreSQL-specific debug modules (lines 544-551).
  • src/main.cpp
    • Included PgSQL_PreparedStatement.h (line 25).
    • Declared GloPgStmt (line 483).
    • Initialized GloPgStmt (line 909).
    • Created a new instance of PgSQL_STMT_Manager_v14 for GloPgStmt (line 928).
    • Added GloPgStmt to the shutdown sequence (lines 1295-1297).
  • test/tap/tests/PG_PS01-t.cpp
    • Added a new test file to perform a basic PostgreSQL prepared statement prepare and execute using libpq.
  • test/tap/tests/PG_PS02-t.cpp
    • Added a new test file to test asynchronous PostgreSQL prepared statement prepare and execute on multiple connections using libpq.
  • test/tap/tests/PG_PS03-t.cpp
    • Added a new test file to test asynchronous PostgreSQL prepared statement prepare and execute on multiple connections using libpq, processing results with PQgetResult in a loop.
  • test/tap/tests/PG_PS04-t.cpp
    • Added a new test file similar to PG_PS03-t.cpp, but with slightly different result processing logic.
  • test/tap/tests/PG_PS05-t.cpp
    • Added a new test file to test asynchronous PostgreSQL prepared statement prepare and execute on multiple connections using libpq and poll, attempting to process results in single-row mode.
  • test/tap/tests/PG_PS06-t.cpp
    • Added a new test file to test asynchronous PostgreSQL prepared statement prepare and execute on multiple connections using libpq and poll, managing connection states and processing results row by row.
  • test/tap/tests/PG_PS06.2-t.cpp
    • Added a new test file similar to PG_PS06-t.cpp, with refinements in state management and result processing loops.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Postgres packets flow,
Parse, Bind, Describe, Execute go,
ProxySQL sees all,
Ready to stand tall,
Prepared statements start to grow.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request makes significant progress towards implementing prepared statement support for PostgreSQL by introducing a base layer for prepared statement handling and then specializing it for both MySQL and PostgreSQL. The refactoring to use base classes (Base_STMT_Global_info, Base_STMTs_local_v14, Base_STMT_Manager_v14) is a good step towards code reuse and maintainability.

The introduction of PostgreSQL-specific packet parsing classes (PgParsePacket, PgBindPacket, etc.) and updates to the PgSQL session and connection handling are crucial for this feature.

Overall, the direction is good, but there are several areas that need attention, particularly regarding correctness, potential memory leaks, and completeness of the PostgreSQL implementation. Some FIXME comments also point to areas that will need further work.

Summary of Findings

  • Memory Leak in PgSQL_Session Parse/Bind Handlers: Potential memory leaks identified in handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_PARSE and handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_BIND where PgParsePacket or PgBindPacket might not be deleted in all execution paths, especially error paths or cache hit scenarios.
  • Incorrect Type Usage in Base_Session::init: In lib/Base_Session.cpp, sess_STMTs_meta for PgSQL_Session is initialized with new MySQL_STMTs_meta(). This is incorrect and should use a PostgreSQL-specific or generic type.
  • Ownership and Lifetime of Packet Buffers: The new packet parsing classes (PgBindPacket, PgParsePacket) take ownership of packet buffers. Clarity on the allocation and deallocation strategy for these buffers is needed to ensure memory safety.
  • Incomplete PostgreSQL Functionality: Some PostgreSQL-specific functionalities, like parts of asynchronous statement execution in PgSQL_Connection.cpp and prepared statement logging in PgSQL_Logger.cpp, appear to be commented out or incomplete.
  • FIXME Comments: Several FIXME comments indicate areas for future improvement or unresolved issues. These should be tracked and addressed.

Merge Readiness

This pull request introduces a significant and valuable feature for PostgreSQL prepared statement support. The refactoring into base classes is a positive step for long-term maintainability.

However, there are several critical and high-severity issues that need to be addressed before this PR can be considered ready for merging:

  1. The incorrect instantiation of MySQL_STMTs_meta for PgSQL_Session in Base_Session.cpp must be fixed.
  2. Potential memory leaks in the new PostgreSQL packet handlers in PgSQL_Session.cpp need to be resolved.
  3. Clarity on the ownership and lifecycle of packet buffers passed to the new parsing classes is required to ensure memory safety.

Additionally, the commented-out sections related to PostgreSQL asynchronous statement execution and logging should either be completed or removed if they are out of scope for this PR.

I am unable to approve pull requests. Another reviewer will need to perform the final approval. I recommend addressing the identified critical and high severity issues before merging.

@@ -85,7 +85,7 @@ void Base_Session<S,DS,B,T>::init() {
sess_STMTs_meta = new MySQL_STMTs_meta();
SLDH = new StmtLongDataHandler();
} else if constexpr (std::is_same_v<S, PgSQL_Session>) {
sess_STMTs_meta = NULL;
sess_STMTs_meta = new MySQL_STMTs_meta();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

In the Base_Session::init() method, for PgSQL_Session, sess_STMTs_meta is being initialized as new MySQL_STMTs_meta(). This seems incorrect and likely a copy-paste error. Shouldn't this be new PgSQL_STMTs_meta() or a generic base class if one exists for STMTs_meta? Using MySQL_STMTs_meta in a PostgreSQL session could lead to unexpected behavior or crashes due to type mismatches or incorrect logic.

Suggested change
sess_STMTs_meta = new MySQL_STMTs_meta();
sess_STMTs_meta = new PgSQL_STMTs_meta(); // Or a generic base class if applicable

Comment on lines +2108 to +2244
(begint.tv_sec * 1000000000 + begint.tv_nsec);
}
assert(qpo); // GloPgQPro->process_mysql_query() should always return a qpo
// setting 'prepared' to prevent fetching results from the cache if the digest matches
/*
// FIXME: no handling of special queries for now
rc_break = handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___MYSQL_COM_QUERY_qpo(&pkt, &lock_hostgroup, PgSQL_ps_type_prepare_stmt);
if (rc_break == true) {
return;
}
*/
if (pgsql_thread___set_query_lock_on_hostgroup == 1) { // algorithm introduced in 2.0.6
if (locked_on_hostgroup < 0) {
if (lock_hostgroup) {
// we are locking on hostgroup now
locked_on_hostgroup = current_hostgroup;
}
}
if (locked_on_hostgroup >= 0) {
if (current_hostgroup != locked_on_hostgroup) {
client_myds->DSS = STATE_QUERY_SENT_NET;
int l = CurrentQuery.QueryLength;
char* end = (char*)"";
if (l > 256) {
l = 253;
end = (char*)"...";
}
string nqn = string((char*)CurrentQuery.QueryPointer, l);
char* err_msg = (char*)"Session trying to reach HG %d while locked on HG %d . Rejecting query: %s";
char* buf = (char*)malloc(strlen(err_msg) + strlen(nqn.c_str()) + strlen(end) + 64);
sprintf(buf, err_msg, current_hostgroup, locked_on_hostgroup, nqn.c_str(), end);
client_myds->myprot.generate_error_packet(true, true, buf, PGSQL_ERROR_CODES::ERRCODE_RAISE_EXCEPTION,
false, true);
thread->status_variables.stvar[st_var_hostgroup_locked_queries]++;
RequestEnd(NULL);
free(buf);
l_free(pkt.size, pkt.ptr);
return;
}
}
}
mybe = find_or_create_backend(current_hostgroup);
//if (client_myds->myconn->local_stmts == NULL) {
// client_myds->myconn->local_stmts = new PgSQL_STMTs_local_v14(true);
//}
PgSQL_STMTs_local_v14 *local_stmts = client_myds->myconn->local_stmts;
std::string stmt_name = (char *)CurrentQuery.stmt_name; // create a string
// if the same statement name is used, we drop it
if (auto search = local_stmts->stmt_name_to_id.find(stmt_name); search != local_stmts->stmt_name_to_id.end()) {
uint64_t client_global_id = search->second;
auto search2 = local_stmts->stmt_id_to_name.find(client_global_id);
assert(search2 != local_stmts->stmt_id_to_name.end());
local_stmts->stmt_id_to_name.erase(search2);
local_stmts->stmt_name_to_id.erase(search);
uint64_t stmt_global_id = 0;
stmt_global_id = client_myds->myconn->local_stmts->find_global_stmt_id_from_client(client_global_id);
client_myds->myconn->local_stmts->client_close(client_global_id);
}
uint64_t hash = client_myds->myconn->local_stmts->compute_hash(
(char*)client_myds->myconn->userinfo->username,
(char*)client_myds->myconn->userinfo->dbname,
(char*)CurrentQuery.QueryPointer,
CurrentQuery.QueryLength
);
PgSQL_STMT_Global_info* stmt_info = NULL;
// we first lock GloStmt
GloPgStmt->wrlock();
stmt_info = GloPgStmt->find_prepared_statement_by_hash(hash);
if (stmt_info) {
// the prepared statement exists in GloMyStmt
// for this reason, we do not need to prepare it again, and we can already reply to the client
// we will now generate a unique stmt and send it to the client
uint32_t new_stmt_id = client_myds->myconn->local_stmts->generate_new_client_stmt_id(stmt_info->statement_id);
CurrentQuery.stmt_client_id = new_stmt_id;
client_myds->setDSS_STATE_QUERY_SENT_NET();
// FIXME: disabled for now
// client_myds->myprot.generate_STMT_PREPARE_RESPONSE(client_myds->pkt_sid + 1, stmt_info, new_stmt_id);
LogQuery(NULL);
l_free(pkt.size, pkt.ptr);
client_myds->DSS = STATE_SLEEP;
status = WAITING_CLIENT_DATA;
CurrentQuery.end_time = thread->curtime;
CurrentQuery.end();
}
else {
mybe = find_or_create_backend(current_hostgroup);
//status = PROCESSING_STMT_PREPARE;
status = PGSQL_PROCESSING_PARSE;
mybe->server_myds->connect_retries_on_failure = pgsql_thread___connect_retries_on_failure;
mybe->server_myds->wait_until = 0;
pause_until = 0;
mybe->server_myds->killed_at = 0;
mybe->server_myds->kill_type = 0;
mybe->server_myds->mysql_real_query.init(&pkt); // fix memory leak for PREPARE in prepared statements #796
mybe->server_myds->statuses.questions++;
client_myds->setDSS_STATE_QUERY_SENT_NET();
}
GloPgStmt->unlock();
return; // make sure to not return before unlocking GloMyStmt
// }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_PARSE, if a prepared statement with the same hash already exists (stmt_info is not NULL after find_prepared_statement_by_hash), the code proceeds to generate a new client statement ID and sends a response. However, the parsePacket (allocated at the beginning of the function) is not deleted in this branch. This will lead to a memory leak.

parsePacket should be deleted before returning if the statement is found in the cache. Consider moving CurrentQuery.ParsePacket = parsePacket; to after the cache check if parsePacket is only needed when the statement is new, or ensure delete parsePacket; in the cache-hit path.

status = WAITING_CLIENT_DATA;
return;
}
CurrentQuery.BindPacket = bindPacket;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___PGSQL_BIND, CurrentQuery.BindPacket = bindPacket; is called. If this function returns early due to an error (e.g., invalid BIND packet, line 2267-2276), bindPacket (which was new-ed) will be leaked because CurrentQuery.end() or the PgSQL_Query_Info destructor (which would delete BindPacket) won't be called in that error path.

Ensure delete bindPacket; is called in all error paths before returning if CurrentQuery.BindPacket hasn't taken ownership or if the function exits before normal cleanup of CurrentQuery.

Comment on lines +59 to +61
// FIXME: for now we will differentiate between mysql and pgsql using this flag
// later we will move into the use of template
bool is_mysql = true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The is_mysql flag here is noted with a FIXME to use templates later. While this works for now, could using a template approach for stmt_execute_metadata_t (perhaps making it a template class itself, or having a common base and then MySQL/PgSQL specific derived classes) be considered for better type safety and to avoid protocol-specific flags in shared structures? This would align well with the general refactoring approach taken for other prepared statement classes.

Comment on lines +98 to +107
~PgBindPacket() {
delete[] param_values;
delete[] param_lengths;
delete[] param_formats;
delete[] result_formats;
if (pkt_ptr != nullptr) {
free(pkt_ptr);
pkt_ptr = nullptr;
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In PgBindPacket::~PgBindPacket, memory is freed for param_values, param_lengths, param_formats, and result_formats using delete[]. This is correct if these members were allocated with new[].

However, the parseBindPacket method (lines 2478-2589 in lib/PgSQL_Protocol.cpp) uses new int[paramFormatCount] etc. for these. This is consistent.

The pkt_ptr is freed with free(pkt_ptr). This implies pkt_ptr was allocated with malloc/calloc/realloc. In PgBindPacket::parseBindPacket, this->pkt_ptr = pkt.ptr; is done, taking ownership. It's crucial that pkt.ptr passed to parseBindPacket was indeed malloc-family allocated if this free is to be safe. Could you confirm the lifecycle and allocation strategy for the packet buffer that pkt.ptr points to when parseBindPacket is called? If it's managed by another system (e.g., a PtrSizeArray that might free it differently or not expect it to be freed here), this could lead to double-free or memory corruption.

Comment on lines +155 to +162
MYSQL_STMT* mysql_stmt; // FIXME: to be removed
unsigned char *stmt_name = NULL;
stmt_execute_metadata_t* stmt_meta; // FIXME: to be removed
uint64_t stmt_global_id;
uint64_t stmt_client_id;
MySQL_STMT_Global_info* stmt_info;
PgSQL_STMT_Global_info* stmt_info;
PgBindPacket *BindPacket = NULL;
PgParsePacket *ParsePacket = NULL;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The mysql_stmt and stmt_meta members are marked with // FIXME: to be removed. This is good to note.
Are there plans to replace their functionality entirely with the new PgBindPacket and PgParsePacket members, or will there be an intermediate PostgreSQL-specific structure similar to stmt_execute_metadata_t but for PgSQL?

#include "PgSQL_Data_Stream.h"
#include "PgSQL_Query_Processor.h"
#include "MySQL_Variables.h"

#define PG_SYM_PROXYSQL

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Defining PG_SYM_PROXYSQL here is crucial for the patches in deps/postgresql/c.h.patch and deps/postgresql/port.h.patch to take effect. This ensures that the PostgreSQL client library code compiled within ProxySQL uses the ProxySQL-specific modifications. This is a good way to manage conditional compilation for patched dependencies.

Is this define scoped only to this compilation unit, or is it intended to be a global define when compiling PostgreSQL sources? If it's only for this file, it won't affect how the libpq.a is built. The patching in deps/Makefile happens before building libpq.a, so those patches will be active during the libpq.a build if PG_SYM_PROXYSQL is defined appropriately in the PostgreSQL build environment (e.g., via CFLAGS or by modifying PostgreSQL's own config headers/makefiles if the patch files themselves don't define it when needed). The current approach of defining it in PgSQL_Connection.cpp might be for ProxySQL's own code that might include PostgreSQL headers directly, rather than for building libpq.a itself. Could you clarify the scope and intent of this define here?

Comment on lines +402 to +602
//async_fetch_row_start=false;
}
}
stmt_execute_store_result_start();
if (async_exit_status) {
next_event(ASYNC_STMT_EXECUTE_STORE_RESULT_CONT);
} else {
NEXT_IMMEDIATE(ASYNC_STMT_EXECUTE_END);
}
break;

case ASYNC_STMT_EXECUTE_STORE_RESULT_CONT:
PROXY_TRACE2();
{ // this copied mostly from ASYNC_USE_RESULT_CONT
if (myds->sess && myds->sess->client_myds && myds->sess->mirror==false) {
unsigned int buffered_data=0;
buffered_data = myds->sess->client_myds->PSarrayOUT->len * RESULTSET_BUFLEN;
//buffered_data += myds->sess->client_myds->resultset->len * RESULTSET_BUFLEN;
if (buffered_data > (unsigned int)mysql_thread___threshold_resultset_size*8) {
next_event(ASYNC_STMT_EXECUTE_STORE_RESULT_CONT); // we temporarily pause . See #1232
break;
}
}
}
stmt_execute_store_result_cont(event);
//if (async_fetch_row_start==false) {
// async_fetch_row_start=true;
//}
if (async_exit_status) {
// this copied mostly from ASYNC_USE_RESULT_CONT
MYSQL_ROWS *r=query.stmt->result.data;
long long unsigned int rows_read_inner = 0;

if (r) {
rows_read_inner++;
while(rows_read_inner < query.stmt->result.rows) {
// it is very important to check rows_read_inner FIRST
// because r->next could point to an invalid memory
rows_read_inner++;
r = r->next;
}
if (rows_read_inner > 1) {
process_rows_in_ASYNC_STMT_EXECUTE_STORE_RESULT_CONT(processed_bytes);
if (
(processed_bytes > (unsigned int)mysql_thread___threshold_resultset_size*8)
||
( mysql_thread___throttle_ratio_server_to_client && mysql_thread___throttle_max_bytes_per_second_to_client && (processed_bytes > (unsigned long long)mysql_thread___throttle_max_bytes_per_second_to_client/10*(unsigned long long)mysql_thread___throttle_ratio_server_to_client) )
) {
next_event(ASYNC_STMT_EXECUTE_STORE_RESULT_CONT); // we temporarily pause
} else {
NEXT_IMMEDIATE(ASYNC_STMT_EXECUTE_STORE_RESULT_CONT); // we continue looping
}
}
}
next_event(ASYNC_STMT_EXECUTE_STORE_RESULT_CONT);
} else {
NEXT_IMMEDIATE(ASYNC_STMT_EXECUTE_END);
}
break;
case ASYNC_STMT_EXECUTE_END:
PROXY_TRACE2();
{
if (query.stmt_result) {
unsigned long long total_size=0;
MYSQL_ROWS *r=query.stmt->result.data;
if (r) {
total_size+=r->length;
if (r->length > 0xFFFFFF) {
total_size+=(r->length / 0xFFFFFF) * sizeof(mysql_hdr);
}
total_size+=sizeof(mysql_hdr);
while(r->next) {
r=r->next;
total_size+=r->length;
if (r->length > 0xFFFFFF) {
total_size+=(r->length / 0xFFFFFF) * sizeof(mysql_hdr);
}
total_size+=sizeof(mysql_hdr);
}
}
__sync_fetch_and_add(&parent->bytes_recv,total_size);
myds->sess->thread->status_variables.stvar[st_var_queries_backends_bytes_recv]+=total_size;
myds->bytes_info.bytes_recv += total_size;
bytes_info.bytes_recv += total_size;
}
}
/*
if (interr) {
NEXT_IMMEDIATE(ASYNC_STMT_EXECUTE_FAILED);
} else {
NEXT_IMMEDIATE(ASYNC_STMT_EXECUTE_SUCCESSFUL);
}
*/
//update_warning_count_from_statement();
break;
// case ASYNC_STMT_EXECUTE_SUCCESSFUL:
// break;
// case ASYNC_STMT_EXECUTE_FAILED:
// break;
#endif // 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This large block of code related to ASYNC_STMT_PREPARE_START through ASYNC_STMT_EXECUTE_END is currently commented out with #if 0. Is this functionality planned for a future iteration, or is it being handled differently now? If it's intended to be part of this PR, it needs to be uncommented and tested. If not, it might be cleaner to remove it to avoid confusion.

Comment on lines +389 to +397
if (header == true && _p[0] == 'P') { // Parse packet
stmt_name = _p+5;
size_t l = strlen((const char*)stmt_name);
QueryPointer = stmt_name + l + 1;
QueryLength = strlen((const char*)QueryPointer);
} else {
QueryLength=(header ? len-5 : len);
QueryPointer=(header ? _p+5 : _p);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In PgSQL_Query_Info::init, the logic to extract stmt_name, QueryPointer, and QueryLength for 'P' (Parse) packets is good.
However, if _p[0] is not 'P', stmt_name remains uninitialized (it's initialized to NULL in begin(), but init() might be called multiple times on the same PgSQL_Query_Info object if it's reused).

Is stmt_name always expected to be NULL if it's not a Parse packet, or should it be explicitly reset to NULL in the else branch here to ensure a clean state for non-Parse packets?

mybe->server_myds->statuses.questions++;
client_myds->setDSS_STATE_QUERY_SENT_NET();
}
GloPgStmt->unlock();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The GloPgStmt->unlock() is correctly placed here to ensure the lock is released whether the statement was found in cache or a new one is being prepared. However, the return; statement immediately following it means that if a new statement is being prepared (the else block from line 2229), the parsePacket (which was assigned to CurrentQuery.ParsePacket = parsePacket; if that line is uncommented, or just exists as a local variable) might not be properly managed if an error occurs later or if the session state transitions without CurrentQuery.end() or CurrentQuery destructor being called in a way that cleans it up.

If parsePacket is assigned to CurrentQuery.ParsePacket, its lifecycle is tied to CurrentQuery. If it's just a local variable here, it should be deleted if not transferred to CurrentQuery.

@rahim-kanji rahim-kanji requested a review from Copilot May 20, 2025 09:35
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR integrates PostgreSQL Prepared Statement support as part of the v3.0 merge into PR #4836. Key changes include disabling legacy MySQL prepared statement pointer handling, adding new object files for prepared statement support via the Makefile, and introducing PostgreSQL-specific classes and functions for session, protocol, and statement management.

Reviewed Changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
lib/MySQL_Protocol.cpp Commented out code for resetting the current prepared statement pointer.
lib/Makefile Added new object files for Base_PreparedStatement, MySQL_PreparedStatement, and PgSQL_PreparedStatement.
lib/Base_Session.cpp Changes in PgSQL session branch now instantiate a MySQL_STMTs_meta object instead of setting it to NULL.
include/PgSQL_Session.h Adjusted class definitions with new forward declarations, removed unused functions, and added several FIXME comments.
include/PgSQL_Protocol.h Introduced new PostgreSQL protocol packet classes and associated parsing functions.
include/PgSQL_Data_Stream.h Renamed class member from PgSQL_MyDS_real_query to PgDS_real_query with a FIXME comment indicating the need for a final naming update.
include/MySQL_Protocol.h Commented out a reference to current_PreStmt to disable legacy behavior.
include/MySQL_PreparedStatement.h Modified base class inheritance and removed obsolete fields.
include/Base_PreparedStatement.h Added templated base classes for handling prepared statement metadata.
deps/postgresql/*.patch and deps/Makefile Applied patches to PostgreSQL headers and updated the deps Makefile to include the new patches.
Comments suppressed due to low confidence (2)

include/PgSQL_Data_Stream.h:103

  • [nitpick] The member 'mysql_real_query' of type 'PgDS_real_query' should be renamed to better reflect its PostgreSQL context to avoid confusion.
PgDS_real_query mysql_real_query; // FIXME: rename this!!

lib/Base_Session.cpp:88

  • Instantiating 'MySQL_STMTs_meta' in the PgSQL session branch may be unexpected; please confirm that this is the intended behavior for PostgreSQL sessions.
sess_STMTs_meta = new MySQL_STMTs_meta();

@@ -123,7 +123,7 @@ void MySQL_Protocol::init(MySQL_Data_Stream **__myds, MySQL_Connection_userinfo
myds=__myds;
userinfo=__userinfo;
sess=__sess;
current_PreStmt=NULL;
//current_PreStmt=NULL;
Copy link
Preview

Copilot AI May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider removing legacy commented-out code for 'current_PreStmt' if it is no longer needed, to improve code clarity.

Suggested change
//current_PreStmt=NULL;

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants