AlphaJack
Contributor

Continues #271.
The statement is now also executed before it is described, in both do_describe_statement() and do_describe_portal().
More kinds of queries are now routed to query() rather than execute().

SELECT 1;

pgbench "host=localhost port=5432 password=pencil" --no-vacuum -f pgbench/select.sql -c 100 -t 100 --protocol simple
# tps = 2651.462995 (without initial connection time)


pgbench "host=localhost port=5432 password=pencil" --no-vacuum -f pgbench/select.sql -c 100 -t 100 --protocol extended
# tps = 2415.428242 (without initial connection time)

SELECT 1, version() FROM 'hf://datasets/ibm/duorc/ParaphraseRC/*.parquet' LIMIT 3;

pgbench "host=localhost port=5432 password=pencil" --no-vacuum -f pgbench/select.sql -c 3 -t 3 --protocol simple
# tps = 0.766688 (without initial connection time)


pgbench "host=localhost port=5432 password=pencil" --no-vacuum -f pgbench/select.sql -c 3 -t 3 --protocol extended
# tps = 1.063646 (without initial connection time)

Owner

@sunng87 sunng87 left a comment

Looks mostly good to me. Please update the Cargo.toml.

    let _ = stmt
        .execute([])
        .map_err(|e| PgWireError::ApiError(Box::new(e)))?;
}
Owner

Wait, we cannot execute it on the describe command. If the query has side effects, this will break the system.

Contributor Author

An alternative is to use DuckDB DESCRIBE for queries that return results:

D describe select 1, version(), false::logical;
┌───────────────────────────────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│              column_name              │ column_type │  null   │   key   │ default │  extra  │
│                varchar                │   varchar   │ varchar │ varchar │ varchar │ varchar │
├───────────────────────────────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ 1                                     │ INTEGER     │ YES     │ NULL    │ NULL    │ NULL    │
│ "version"()                           │ VARCHAR     │ YES     │ NULL    │ NULL    │ NULL    │
│ CAST(CAST('f' AS BOOLEAN) AS BOOLEAN) │ BOOLEAN     │ YES     │ NULL    │ NULL    │ NULL    │
└───────────────────────────────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘

D select 1, version(), false::logical;
┌───────┬─────────────┬───────────────────────────────────────┐
│   1   │ "version"() │ CAST(CAST('f' AS BOOLEAN) AS BOOLEAN) │
│ int32 │   varchar   │                boolean                │
├───────┼─────────────┼───────────────────────────────────────┤
│   1   │ v1.3.0      │ false                                 │
└───────┴─────────────┴───────────────────────────────────────┘

What are do_describe_statement() and do_describe_portal() expected to return for DDL queries, given that DESCRIBE doesn't support them?

Owner

For DDL queries, it returns a NoData message in the Postgres wire protocol.
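
For reference, a minimal sketch of what a NoData message looks like on the wire, assuming the layout given in the PostgreSQL protocol documentation (a type byte `n` followed by a 4-byte big-endian length of 4). The function name `encode_no_data` is hypothetical and hand-rolled for illustration; it is not the pgwire API, which presumably has its own message type for this.

```rust
/// Encode a PostgreSQL wire-protocol NoData message by hand.
/// Layout: Byte1('n'), then Int32(4). The length field counts itself
/// (4 bytes) but not the leading type byte, and the message has no body.
/// NOTE: `encode_no_data` is a hypothetical helper, not a pgwire function.
fn encode_no_data() -> Vec<u8> {
    let mut buf = Vec::with_capacity(5);
    buf.push(b'n');
    buf.extend_from_slice(&4i32.to_be_bytes());
    buf
}

fn main() {
    // Print the five bytes a describe of a DDL statement would send back.
    println!("{:?}", encode_no_data());
}
```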

This approach looks good. Do you know if there is a DuckDB API for this? It feels like there should be something like a describe method on Statement. That is what I expected column_type and column_name to return.

Owner

It seems DuckDB is able to get the column structure of a statement without running it, but duckdb-rs, at least, doesn't expose this as a Statement API. It would be nice to add it so we can call column_type/column_name either after query or after describe.

Contributor Author

Where did you read that DuckDB can retrieve the column structure for a statement without running it?
In the C API, it seems that duckdb_column_count() and related functions still need a duckdb_result pointer generated by duckdb_query() (doc PDF page 54, doc website).

On the other hand, using DESCRIBE would mean maintaining a new map from type strings to FieldInfo datatypes.
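
A rough sketch of what such a map could look like, assuming the `column_type` strings come back in the form shown in the DESCRIBE output above (`INTEGER`, `VARCHAR`, `BOOLEAN`). The `PgType` enum and `pg_type_from_describe` are hypothetical stand-ins, not part of pgwire or duckdb-rs; a real implementation would return whatever type FieldInfo actually expects, and would need to handle nested and parameterized types that this sketch rejects.

```rust
/// Hypothetical stand-in for the Postgres type stored in a FieldInfo.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PgType {
    Bool,
    Int2,
    Int4,
    Int8,
    Float4,
    Float8,
    Varchar,
    Date,
    Timestamp,
}

impl PgType {
    /// Built-in PostgreSQL type OIDs for the types above.
    fn oid(self) -> u32 {
        match self {
            PgType::Bool => 16,
            PgType::Int8 => 20,
            PgType::Int2 => 21,
            PgType::Int4 => 23,
            PgType::Float4 => 700,
            PgType::Float8 => 701,
            PgType::Varchar => 1043,
            PgType::Date => 1082,
            PgType::Timestamp => 1114,
        }
    }
}

/// Map a DESCRIBE `column_type` string to a Postgres type.
/// Returns None for anything not covered (e.g. STRUCT, LIST, DECIMAL(p,s)).
fn pg_type_from_describe(column_type: &str) -> Option<PgType> {
    match column_type.trim().to_ascii_uppercase().as_str() {
        "BOOLEAN" => Some(PgType::Bool),
        "SMALLINT" => Some(PgType::Int2),
        "INTEGER" => Some(PgType::Int4),
        "BIGINT" => Some(PgType::Int8),
        "FLOAT" => Some(PgType::Float4),
        "DOUBLE" => Some(PgType::Float8),
        "VARCHAR" => Some(PgType::Varchar),
        "DATE" => Some(PgType::Date),
        "TIMESTAMP" => Some(PgType::Timestamp),
        _ => None,
    }
}

fn main() {
    // The three column types from the DESCRIBE example in this thread.
    for name in ["INTEGER", "VARCHAR", "BOOLEAN"] {
        let mapped = pg_type_from_describe(name);
        println!("{name} -> {mapped:?} (oid {:?})", mapped.map(PgType::oid));
    }
}
```

The `None` branch is the part that makes this approach tricky: every type string the map does not recognize has to be handled some other way.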

Owner

I didn't read it in their API docs. I inferred it from the DESCRIBE statement, which returns column structures without executing the statement, so in theory the same capability could be exposed as an API.

I agree that using DESCRIBE in this scenario can be a little tricky. duckdb-rs has a Type enumeration, but we would need to translate it into Arrow types and then into Postgres types.

By the way, I'm going to ask upstream if it's possible to add describe API to statement.

Owner

@sunng87 sunng87 Jun 17, 2025

Feature request thread in duckdb: duckdb/duckdb#17951

Owner

@sunng87 sunng87 left a comment

Sorry, I missed the describe implementation. We cannot run the query in describe, because the query would then actually run twice.
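
A toy illustration of the double execution, under heavy assumptions: `ToyDb` is a hypothetical in-memory stand-in that only counts inserted rows, and the flow of a describe step followed by an execute step is a simplification of the extended query protocol, not the code in this PR.

```rust
/// Hypothetical stand-in for a database; it only counts inserted rows.
#[derive(Default)]
struct ToyDb {
    rows: u32,
}

impl ToyDb {
    /// Running a statement applies its side effect (here: one row per INSERT).
    fn execute(&mut self, sql: &str) {
        if sql.trim_start().to_ascii_uppercase().starts_with("INSERT") {
            self.rows += 1;
        }
    }

    /// Describe implemented by executing the statement first.
    fn describe_by_executing(&mut self, sql: &str) {
        self.execute(sql);
    }

    /// Describe that only inspects the statement and never runs it.
    fn describe_without_executing(&self, _sql: &str) {}
}

/// Simulate one describe step followed by one execute step
/// and report how many rows exist afterwards.
fn rows_after_extended_query(describe_executes: bool) -> u32 {
    let mut db = ToyDb::default();
    let sql = "INSERT INTO t VALUES (1)";
    if describe_executes {
        db.describe_by_executing(sql);
    } else {
        db.describe_without_executing(sql);
    }
    db.execute(sql);
    db.rows
}

fn main() {
    println!("describe executes: {} row(s)", rows_after_extended_query(true));
    println!("describe inspects only: {} row(s)", rows_after_extended_query(false));
}
```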
