Fix/register partitioned tables in glue #523
Open
firewall413 wants to merge 65 commits into duckdb:master from firewall413:fix/register-partitioned-tables-in-glue
Conversation
firewall413 commented on Mar 20, 2025
- Resolved an issue where writing a new partition to Glue for a table with an unchanged schema would create additional Glue schema versions instead of just adding the partition.
- Resolved an issue where, when writing a new partition to Glue, the schema of the file at external_read_location //*.parquet would be picked up instead of the schema of the most recently written partition (year=2024/month=03/day=20/*.parquet).
… will update the Glue schema to the schema of the latest partition (which in turn allows for append-only schema evolution in later partitions, this is supported by Glue)
…sult in multiple schema versions (instead of just adding a partition)
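The fix described above boils down to comparing the new partition's schema against the schema Glue already stores, and only pushing a new schema version when they actually differ. A minimal sketch of that decision logic, assuming columns are represented as (name, type) pairs; the function names (`schema_changed`, `plan_glue_update`) are illustrative, not the PR's actual identifiers:

```python
# Hypothetical sketch of the schema check behind this fix: only push a new
# schema version to Glue when the partition's columns differ from the table's
# current columns; otherwise just register the partition.

def schema_changed(glue_columns, partition_columns):
    """Compare Glue's stored columns against the new partition's columns.

    Each column is a (name, type) pair. Any difference, including columns
    appended at the end (append-only schema evolution, which Glue supports),
    counts as a schema change that warrants a table update.
    """
    return list(glue_columns) != list(partition_columns)


def plan_glue_update(glue_columns, partition_columns):
    """Decide what to do for a new partition: register it as-is, or also
    update the Glue table schema to the latest partition's schema."""
    if schema_changed(glue_columns, partition_columns):
        return "update_table_schema_and_add_partition"
    return "add_partition_only"
```

With an unchanged schema this returns "add_partition_only", so repeated writes no longer accumulate redundant Glue schema versions.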
@firewall413, hey, nice to see you, and thanks for putting this together!
Still working on those failing tests, my bad; I'll finish up after my holidays!
…oint Remove duckdbt entrypoint for now
Add info on the interactive shell for the DuckDB UI to the README
Bumps [dbt-tests-adapter](https://github.com/dbt-labs/dbt-adapters) from 1.11.0 to 1.15.0. - [Release notes](https://github.com/dbt-labs/dbt-adapters/releases) - [Commits](https://github.com/dbt-labs/dbt-adapters/commits) --- updated-dependencies: - dependency-name: dbt-tests-adapter dependency-version: 1.15.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
…apter-1.15.0 Bump dbt-tests-adapter from 1.11.0 to 1.15.0
Apply project quoting setting to add/remove columns
As per the [DuckDB docs](https://duckdb.org/docs/stable/extensions/postgres.html#managing-multiple-secrets), a secret can be given a name (which is already possible) and this secret can be referenced in the attach statement. This is convenient if you want to e.g. attach multiple postgres databases
Add named secret parameter to attachment class for profiles.yml
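Putting the two changes above together, a profiles.yml might define a named Postgres secret and reference it from an attach entry. This is a sketch only: the `secret` key on the attach entry is the field this change appears to add, and the exact key name is an assumption; host and credential values are placeholders.

```yaml
# Hypothetical profiles.yml fragment: a named secret referenced from an
# ATTACH entry, convenient when attaching multiple Postgres databases.
dev:
  type: duckdb
  path: /tmp/dbt.duckdb
  secrets:
    - type: postgres
      name: my_pg_secret        # named secret (already supported)
      host: localhost
      port: 5432
      database: postgres
      user: postgres
      password: example
  attach:
    - path: ""
      alias: pg_db
      type: postgres
      secret: my_pg_secret      # assumed field name added by this change
```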
…d up all of a sudden
…lis/read-.duckdbrc-file-on-duckdbt-cli-startup Read .duckdbrc file on DuckDBT CLI startup
…failures See if this is related to the mysterious doc test failures that popped up on master
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.1 to 1.5.2. - [Release notes](https://github.com/spulec/freezegun/releases) - [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG) - [Commits](spulec/freezegun@1.5.1...1.5.2) --- updated-dependencies: - dependency-name: freezegun dependency-version: 1.5.2 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]>
Enhances the Attachment dataclass to support arbitrary ATTACH statement options via an options dict, enabling users to leverage new DuckDB attachment features without waiting for explicit dbt-duckdb support. Key changes:
- Add optional options field to Attachment dataclass for arbitrary key-value pairs
- Update to_sql method to handle options dict with conflict detection
- Maintain full backward compatibility with existing type, secret, read_only fields
- Add comprehensive tests covering new functionality and edge cases
- Update README with documentation and examples of the new feature
- Remove outdated information about DuckDB 0.7.0 and run hooks from docs
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
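The commit above describes an options dict rendered into the ATTACH statement, with conflict detection against the dedicated fields. A simplified sketch of how that could look (this is not the actual dbt-duckdb implementation; the `RESERVED` set and option rendering are assumptions):

```python
# Illustrative Attachment dataclass whose to_sql() renders arbitrary ATTACH
# options from a dict, rejecting keys that collide with first-class fields.
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Attachment:
    path: str
    alias: Optional[str] = None
    read_only: bool = False
    options: Dict[str, Any] = field(default_factory=dict)

    RESERVED = {"read_only"}  # keys handled by dedicated fields

    def to_sql(self) -> str:
        sql = f"ATTACH '{self.path}'"
        if self.alias:
            sql += f" AS {self.alias}"
        opts = []
        if self.read_only:
            opts.append("READ_ONLY")
        for key, value in self.options.items():
            if key.lower() in self.RESERVED:
                raise ValueError(f"Option '{key}' conflicts with a dedicated field")
            # Boolean flags render bare; other values render as KEY value.
            opts.append(key.upper() if value is True else f"{key.upper()} {value}")
        if opts:
            sql += f" ({', '.join(opts)})"
        return sql
```

Usage: `Attachment(path="test.db", alias="t", read_only=True, options={"cache_size": 100}).to_sql()` yields `ATTACH 'test.db' AS t (READ_ONLY, CACHE_SIZE 100)`, while passing `options={"READ_ONLY": True}` raises instead of silently duplicating the dedicated field.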
Bump freezegun from 1.5.1 to 1.5.2
Allow the database field in credentials to match any alias from the attach list, enabling users to specify an attached database as their primary database instead of being restricted to the path-derived database name. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
…y_kv Add support for arbitrary key-value pairs in Attachment ATTACH options
…uckdb#557)
* feat: add conditional cascade removal for ducklake relations with logging
* Remove cascade for testing
* Introduce error for testing
* fix: correct drop_relation macro with proper syntax and cascade handling
* refactor: update duckdb drop_relation macro to handle path object safely
* refactor: simplify ducklake relation detection in drop_relation macro
* refactor: update drop_relation macro to check ducklake path directly
* fix: modify ducklake path detection in drop_relation macro
* The changes look good. Key modifications:
  1. For both `duckdb__drop_relation` and `duckdb__drop_schema` macros:
     - Added a namespace approach to track whether the relation/schema is associated with a ducklake attachment
     - Check if the relation's database matches an attachment's alias
     - Check if the attachment path contains 'ducklake:'
     - If a ducklake attachment is found, drop without cascade; otherwise, drop with cascade
  2. Removed previous hardcoded checks for 'ducklake:' in the database name
  3. Used a more robust method to detect ducklake-related relations/schemas
  The implementation ensures that:
  - Ducklake-attached databases are handled differently
  - The check is performed dynamically based on the profile configuration
  - The existing cascade behavior is preserved for non-ducklake databases
  Commit message: refactor: improve ducklake detection in drop relation and schema macros
* refactor: remove log statements from drop macros
* refactor: extract ducklake detection into reusable helper macro
  - Create is_ducklake_database() helper macro to centralize ducklake detection logic
  - Eliminate code duplication between drop_schema and drop_relation macros
  - Improve maintainability by having single source of truth for ducklake checks
  - Use early return optimization in helper macro for better performance
* refactor: move ducklake detection to adapter class with @available decorator
  - Add is_ducklake() method to DuckDBAdapter class as @available method
  - Remove Jinja macro approach in favor of proper Python implementation
  - Pass full relation object to adapter method for better type safety
  - Improve testability by moving logic to adapter class
  - Follow dbt-duckdb conventions for adapter functionality
  Addresses maintainer feedback to use @available adapter methods for complex logic.
* test: add comprehensive unit tests for is_ducklake adapter method
  - Test all scenarios: no attach config, empty config, ducklake paths, regular paths
  - Test edge cases: None relation, missing database, missing alias, empty path
  - Test mixed attachment configurations with multiple database types
  - All 9 test cases pass, ensuring robust ducklake detection logic
* fix: update tests to use correct ducklake path format
  - Change test paths from "ducklake:s3://..." to "ducklake:sqlite:storage/metadata.sqlite"
  - Use proper ducklake format as shown in real profiles.yml examples
* Revert drop schema
* perf: optimize is_ducklake with efficient set lookup
  - Add _ducklake_dbs set to DuckDBCredentials.__post_init__ for O(1) lookup
  - Simplify is_ducklake method to use set membership instead of O(n) iteration
  - Maintains same functionality while minimizing overhead for non-ducklake projects
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <[email protected]>
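The final optimization in the commit chain above replaces a per-call scan of the attach list with a precomputed set of ducklake database aliases. A minimal sketch, assuming attach entries are dicts with `path` and `alias` keys (the real code builds this set in `DuckDBCredentials.__post_init__`; these standalone function names are illustrative):

```python
# Sketch of the O(1) ducklake lookup: precompute the aliases of all
# attachments whose path uses the "ducklake:" scheme, then answer
# is_ducklake() with a set-membership test instead of an O(n) scan.

def build_ducklake_dbs(attach_entries):
    """Collect aliases of attach entries backed by a ducklake: path."""
    dbs = set()
    for entry in attach_entries:
        path = entry.get("path") or ""
        alias = entry.get("alias")
        if alias and path.startswith("ducklake:"):
            dbs.add(alias)
    return dbs


def is_ducklake(ducklake_dbs, relation_database):
    """O(1) check used by drop_relation to decide whether to skip CASCADE."""
    return relation_database in ducklake_dbs
```

Because the set is built once from the profile, projects with no ducklake attachments pay only an empty-set membership test per drop.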
I believe this fixes the root cause of duckdb#513
## Problem
Currently, when using MotherDuck with dbt-duckdb, a real temp schema and real tables are created to store the staging tables, because temp tables are not yet supported. By default this schema is called `dbt_temp`. When running dbt with 2 or more threads, each incremental model will open a transaction that starts with the statement `CREATE SCHEMA IF NOT EXISTS dbt_temp`. The model that executes first will cause a catalog change with the `CREATE SCHEMA` statement while its transaction is still open, and the rest of the models will fail with a write-write conflict on that statement.
## Solution & Considerations
If the schema `dbt_temp` already exists, the `CREATE SCHEMA` statement is a no-op, and all incremental models will succeed. So ideally, the create schema statement should be run just once at the beginning of a dbt run. But the current implementation allows each model to configure its own temp schema. This means that we won't fully know the name of the schema that needs to be created until processing the model, and that the `CREATE SCHEMA IF NOT EXISTS {temp_schema_name}` statement in `macros/materializations/incremental.sql` needs to be preserved. So the fix addresses only the creation of the default temp schema `dbt_temp`, by running `CREATE SCHEMA IF NOT EXISTS dbt_temp` when initializing the connection.
--- updated-dependencies: - dependency-name: sigstore/gh-action-sigstore-python dependency-version: 3.0.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mypy](https://github.com/python/mypy) from 1.16.0 to 1.16.1. - [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md) - [Commits](python/mypy@v1.16.0...v1.16.1) --- updated-dependencies: - dependency-name: mypy dependency-version: 1.16.1 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…existence of upstream_nodes (duckdb#588)
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.2 to 1.5.3. - [Release notes](https://github.com/spulec/freezegun/releases) - [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG) - [Commits](spulec/freezegun@1.5.2...1.5.3) --- updated-dependencies: - dependency-name: freezegun dependency-version: 1.5.3 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
--- updated-dependencies: - dependency-name: pyarrow dependency-version: 21.0.0 dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add struct column support with BigQuery-style flatten functionality
  - Add fields property to DuckDBColumn to support nested struct columns
  - Implement create_from_dict and create factory methods for column creation
  - Add is_struct method to detect struct data types
  - Add flatten method to recursively flatten nested struct columns into dot-notation columns
  - Update tests to use new factory methods and add comprehensive struct tests
  - Fix return type annotation in get_column_schema_from_query method
* Simplify struct column implementation
  - Remove unnecessary factory methods and auto-conversion logic
  - Use direct constructor calls instead of create() methods
  - Eliminate __post_init__ and create_from_dict complexity
  - Simplify flatten() method to use basic constructor
  - Update tests to use simplified API
* Fix lint error: avoid shadowing 'field' import with loop variable
* feat: Add struct flattening to get_column_schema_from_query
  This commit introduces automatic flattening of STRUCT data types within the `get_column_schema_from_query` method. Previously, when a query returned a STRUCT, it was treated as a single, opaque column. This change recursively unnests STRUCTs, creating a separate column for each field, with names reflecting their nested path (e.g., `struct_column.field_name`). Changes include:
  - A `__post_init__` method was added to the `DuckDBColumn` class to parse STRUCT type definitions and build a nested representation of their fields.
  - The `flatten` method in `DuckDBColumn` was updated to recursively traverse the nested structure and produce a flat list of columns.
  - The `get_column_schema_from_query` method in `DuckDBAdapter` was updated to use the new `DuckDBColumn` functionality.
  - Tests for `DuckDBColumn` were updated to reflect the new parsing logic and to be more concise.
* undo those changes
* rm print in test
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <[email protected]>
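The struct flattening described above can be sketched as a small recursive parser: given a DuckDB-style `STRUCT(...)` type string, emit one dot-notation column per leaf field. This is a simplified illustration, not the adapter's `DuckDBColumn` implementation; it ignores quoted field names and only handles nesting via a parenthesis depth count:

```python
# Sketch of BigQuery-style struct flattening over DuckDB type strings:
# "STRUCT(a INT, b STRUCT(c TEXT))" on column "col" flattens to
# [("col.a", "INT"), ("col.b.c", "TEXT")].

def split_fields(body):
    """Split 'a INT, b STRUCT(c TEXT)' on top-level commas only."""
    parts, depth, cur = [], 0, ""
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(cur.strip())
            cur = ""
        else:
            cur += ch
    if cur.strip():
        parts.append(cur.strip())
    return parts


def flatten(name, dtype):
    """Recursively flatten STRUCT columns into dot-notation (name, type) pairs."""
    dtype = dtype.strip()
    if dtype.upper().startswith("STRUCT(") and dtype.endswith(")"):
        inner = dtype[len("STRUCT("):-1]
        out = []
        for field_def in split_fields(inner):
            fname, ftype = field_def.split(" ", 1)
            out.extend(flatten(f"{name}.{fname}", ftype))
        return out
    return [(name, dtype)]  # non-struct leaf column
```

Non-struct types pass through untouched, which matches the "opaque column" behavior for everything except STRUCTs.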
--- updated-dependencies: - dependency-name: mypy dependency-version: 1.17.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [dbt-tests-adapter](https://github.com/dbt-labs/dbt-adapters) from 1.16.0 to 1.17.0. - [Release notes](https://github.com/dbt-labs/dbt-adapters/releases) - [Commits](https://github.com/dbt-labs/dbt-adapters/commits) --- updated-dependencies: - dependency-name: dbt-tests-adapter dependency-version: 1.17.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mypy](https://github.com/python/mypy) from 1.17.0 to 1.17.1. - [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md) - [Commits](python/mypy@v1.17.0...v1.17.1) --- updated-dependencies: - dependency-name: mypy dependency-version: 1.17.1 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 5. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](actions/download-artifact@v4...v5) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.3 to 1.5.5. - [Release notes](https://github.com/spulec/freezegun/releases) - [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG) - [Commits](spulec/freezegun@1.5.3...1.5.5) --- updated-dependencies: - dependency-name: freezegun dependency-version: 1.5.5 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix secrets map/list type handling in DuckDB SQL generation
  Add support for proper DuckDB map and array syntax when converting dict and list types in secret parameters to SQL. Previously, these complex types were incorrectly wrapped in quotes, causing syntax errors like "Parser Error: syntax error at or near 'TYPE'". Changes:
  - Add _format_value() method to handle dict, list, and scalar types
  - Dict values now generate: map {'key': 'value', 'key2': 'value2'}
  - List values now generate: array ['item1', 'item2', 'item3']
  - Scalar values maintain existing behavior with proper quoting
  This fixes issues with ducklake secrets that use metadata_parameters as a map type, enabling proper conversion from YAML to DuckDB SQL.
* Remove redundant test assertions for map/list secrets
  Removed trivial property access assertions in favor of focusing on the meaningful SQL generation verification.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <[email protected]>
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v4...v5) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…bility (duckdb#612) Replace logger.exception with logger.warning + traceback formatting to preserve stack trace information while using WARNING level instead of ERROR level. This prevents orchestrators like Dagster from treating the log entry as a fatal exception that requires process termination. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <[email protected]>
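The logging change above keeps the stack trace while downgrading the severity. A small sketch of the pattern, assuming a hypothetical `close_quietly` helper (the logger name and function are illustrative):

```python
# Sketch: preserve the traceback but log at WARNING instead of ERROR, so
# orchestrators like Dagster don't treat the entry as a fatal exception.
import logging
import traceback

logger = logging.getLogger("dbt_duckdb.example")


def close_quietly(conn):
    """Close a connection; failures are logged as warnings with a traceback."""
    try:
        conn.close()
    except Exception:
        # logger.exception() would emit at ERROR level; instead, format the
        # current traceback ourselves and emit it at WARNING.
        logger.warning("Failed to close connection:\n%s", traceback.format_exc())
```

The warning still carries the full `Traceback (most recent call last):` block, so debuggability is unchanged; only the level differs.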
…hub.com/firewall413/dbt-duckdb into fix/register-partitioned-tables-in-glue
> cursor.execute("CREATE TABLE tt1 (id int, name text)")
E sqlite3.OperationalError: table tt1 already exists
Would it be possible to have a look at the MotherDuck test? It seems to be blocking this PR.
I'll take a look after this run completes.