Skip to content

Conversation

justin2004
Copy link

when a spec file has an input parameter like:

            "schema": {
              "format": "date-time",
              "type": "string"

and you need to

select * from somecatalog.default.someendpoint where ts = TIMESTAMP '2025-04-18 12:30:45.123'

without this PR you'll get:

Query 20250421_163824_00171_9ag3c, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0.12 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20250421_163824_00171_9ag3c failed: Unsupported field: YearOfEra

with a trace like this:

jshell> dateFormatter.format(Instant.ofEpochMilli(1745254365000L))
|  Exception java.time.temporal.UnsupportedTemporalTypeException: Unsupported field: YearOfEra
|        at Instant.getLong (Instant.java:604)
|        at DateTimePrintContext.getValue (DateTimePrintContext.java:308)
|        at DateTimeFormatterBuilder$NumberPrinterParser.format (DateTimeFormatterBuilder.java:2763)
|        at DateTimeFormatterBuilder$CompositePrinterParser.format (DateTimeFormatterBuilder.java:2402)
|        at DateTimeFormatter.formatTo (DateTimeFormatter.java:1849)
|        at DateTimeFormatter.format (DateTimeFormatter.java:1823)
|        at (#33:1)

but this PR allows this:

jshell> dateFormatter.format(Instant.ofEpochMilli(1745249894000L).atZone(ZoneOffset.UTC))
$35 ==> "2025-04-21T15:38:14.000Z"

@justin2004
Copy link
Author

justin2004 commented Apr 21, 2025

wait a moment something is off...

when i do

select * from cat.default.end where ts = TIMESTAMP '2025-04-18 12:30:45.123';

the rest client emits a call (which it wasn't before) but the param it emits is:

ts=+57266-03-13T08:32:03.000000000.000000.000Z

but i need the rest client to emit something like:

ts=2024-04-01T12%3A00%3A00.000000Z

which i believe is RFC 3339, section 5.6 conforming

@nineinchnick
Copy link
Owner

The issue is more complex. The implementation of ‎OpenApiClient::getFilter is not correct for complex types, like timestamps. I'm not super familiar with type coercion that Trino does for predicates, so I'm not sure where the type information comes from for the domain values. I'd assume Trino coerces these values to the correct column type.

We should use the column's type, not only the source type (Schema object from the spec), to correctly encode the value. This would be the reverse of https://github.com/nineinchnick/trino-openapi/blob/main/src/main/java/pl/net/was/JsonTrinoConverter.java#L254, which also doesn't cover all timestamp types.

I haven't done it yet, since it's quite a lot of work to cover all timestamp types supported in Trino, including edge cases for weird timezones. We can write a bunch of unit tests, covering the same cases as in https://github.com/trinodb/trino/blob/master/plugin/trino-postgresql/src/test/java/io/trino/plugin/postgresql/TestPostgreSqlTypeMapping.java#L1399
We can also add some e2e tests by extending https://github.com/nineinchnick/trino-openapi/blob/main/src/test/resources/fastapi/main.py

I'm not asking you to do all of this, just dumping what I remember here. We should create an issue for this, and start hacking at it incrementally.

@nineinchnick
Copy link
Owner

I created an issue for this #161

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants