
Conversation

firewall413

  • Resolved an issue where writing a new partition to Glue for a table with an unchanged schema would create multiple Glue schema versions instead of just adding the partition.
  • Resolved an issue where, when writing a new partition to Glue, the schema of the file at external_read_location //*.parquet would be picked up instead of the schema of the actual last written partition year=2024/month=03/day=20/*.parquet.
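A hedged sketch of the behavior these fixes aim for, not the PR's actual code: compare the new partition's columns against the table's current Glue schema, only update the table (which registers a new schema version) when the columns genuinely differ, and otherwise just add the partition. The boto3 Glue calls are standard, but the helper name and parameters here are illustrative.

```python
import boto3
from typing import Dict, List

glue = boto3.client("glue")

def write_partition(
    database: str,
    table: str,
    partition_values: List[str],
    partition_location: str,
    partition_columns: List[Dict[str, str]],
) -> None:
    existing = glue.get_table(DatabaseName=database, Name=table)["Table"]
    existing_columns = existing["StorageDescriptor"]["Columns"]

    if partition_columns != existing_columns:
        # Schema actually changed: update the table, which registers a new schema version.
        glue.update_table(
            DatabaseName=database,
            TableInput={
                "Name": table,
                "PartitionKeys": existing.get("PartitionKeys", []),
                "StorageDescriptor": {
                    **existing["StorageDescriptor"],
                    "Columns": partition_columns,
                },
            },
        )

    # Unchanged schema: adding the partition alone must not create another schema version.
    glue.create_partition(
        DatabaseName=database,
        TableName=table,
        PartitionInput={
            "Values": partition_values,
            "StorageDescriptor": {
                **existing["StorageDescriptor"],
                "Location": partition_location,
            },
        },
    )
```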

Arno Roos and others added 5 commits January 24, 2025 15:58
… will update the Glue schema to the schema of the latest partition (which in turn allows for append-only schema evolution in later partitions, this is supported by Glue)
…sult in multiple schema versions (instead of just adding a partition)
jwills (Collaborator) commented Mar 20, 2025

@firewall413, hey nice to see you-- and thanks for putting this together!

firewall413 (Author) commented:

Still working on those failing tests, my bad, I'll finish up after my holidays!

Guen Prawiroatmodjo and others added 23 commits April 21, 2025 08:42
Add info on the interactive shell for the DuckDB UI to the README
Bumps [dbt-tests-adapter](https://github.com/dbt-labs/dbt-adapters) from 1.11.0 to 1.15.0.
- [Release notes](https://github.com/dbt-labs/dbt-adapters/releases)
- [Commits](https://github.com/dbt-labs/dbt-adapters/commits)

---
updated-dependencies:
- dependency-name: dbt-tests-adapter
  dependency-version: 1.15.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
…apter-1.15.0

Bump dbt-tests-adapter from 1.11.0 to 1.15.0
Apply project quoting setting to add/remove columns
As per the [DuckDB docs](https://duckdb.org/docs/stable/extensions/postgres.html#managing-multiple-secrets), a secret can be given a name (which is already possible), and that secret can be referenced in the ATTACH statement. This is convenient if you want to attach multiple Postgres databases, for example.
Add named secret parameter to attachment class for profiles.yml
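Following the DuckDB docs linked above, a minimal illustration of what named secrets enable; connection details are placeholders and this is not dbt-duckdb code:

```python
import duckdb

con = duckdb.connect()

# Two named Postgres secrets, one per upstream database.
con.execute("""
    CREATE SECRET pg_main (
        TYPE postgres, HOST 'localhost', DATABASE 'main_db',
        USER 'dbt', PASSWORD '***'
    )
""")
con.execute("""
    CREATE SECRET pg_analytics (
        TYPE postgres, HOST 'localhost', DATABASE 'analytics_db',
        USER 'dbt', PASSWORD '***'
    )
""")

# Each ATTACH references its secret by name, so multiple Postgres
# databases can be attached side by side.
con.execute("ATTACH '' AS main_db (TYPE postgres, SECRET pg_main)")
con.execute("ATTACH '' AS analytics_db (TYPE postgres, SECRET pg_analytics)")
```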
…lis/read-.duckdbrc-file-on-duckdbt-cli-startup

Read .duckdbrc file on DuckDBT CLI startup
…failures

See if this is related to the mysterious doc test failures that popped up on master
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.1 to 1.5.2.
- [Release notes](https://github.com/spulec/freezegun/releases)
- [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG)
- [Commits](spulec/freezegun@1.5.1...1.5.2)

---
updated-dependencies:
- dependency-name: freezegun
  dependency-version: 1.5.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Enhances the Attachment dataclass to support arbitrary ATTACH statement options via an options dict, enabling users to leverage new DuckDB attachment features without waiting for explicit dbt-duckdb support.

Key changes:
- Add optional options field to Attachment dataclass for arbitrary key-value pairs
- Update to_sql method to handle options dict with conflict detection
- Maintain full backward compatibility with existing type, secret, read_only fields
- Add comprehensive tests covering new functionality and edge cases
- Update README with documentation and examples of the new feature
- Remove outdated information about DuckDB 0.7.0 and run hooks from docs

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
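A minimal sketch of the Attachment shape described in the key changes above, with illustrative field names and simplified value rendering; the merged implementation may differ:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class Attachment:
    path: str
    alias: Optional[str] = None
    type: Optional[str] = None
    secret: Optional[str] = None
    read_only: bool = False
    # Arbitrary ATTACH options as key-value pairs, e.g. {"BLOCK_SIZE": 16384}.
    options: Dict[str, Any] = field(default_factory=dict)

    def to_sql(self) -> str:
        opts: Dict[str, Any] = {}
        if self.type:
            opts["TYPE"] = self.type
        if self.secret:
            opts["SECRET"] = self.secret
        if self.read_only:
            opts["READ_ONLY"] = None  # flag-style option without a value
        for key, value in self.options.items():
            if key.upper() in opts:
                # Conflict detection: options must not duplicate an explicit field.
                raise ValueError(f"ATTACH option '{key}' conflicts with an explicit field")
            opts[key.upper()] = value

        sql = f"ATTACH '{self.path}'"
        if self.alias:
            sql += f" AS {self.alias}"
        if opts:
            # Values are rendered verbatim here for brevity; real code would quote as needed.
            rendered = ", ".join(k if v is None else f"{k} {v}" for k, v in opts.items())
            sql += f" ({rendered})"
        return sql
```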
Allow the database field in credentials to match any alias from the attach list,
enabling users to specify an attached database as their primary database instead
of being restricted to the path-derived database name.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
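A small sketch of the relaxed check this describes, with hypothetical names; the actual credentials validation lives elsewhere in the adapter:

```python
import os
from typing import List

def path_derived_name(path: str) -> str:
    # Roughly how DuckDB derives its default catalog name from the database path.
    if path == ":memory:" or not path:
        return "memory"
    return os.path.splitext(os.path.basename(path))[0]

def validate_database(database: str, path: str, attach_aliases: List[str]) -> None:
    # Accept either the path-derived name or any alias from the attach list.
    allowed = {path_derived_name(path), *attach_aliases}
    if database not in allowed:
        raise ValueError(
            f"database '{database}' must be the path-derived name or one of the "
            f"attached aliases: {sorted(allowed)}"
        )
```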
…y_kv

Add support for arbitrary key-value pairs in Attachment ATTACH options
…uckdb#557)

* feat: add conditional cascade removal for ducklake relations with logging

* Remove cascade for testing

* Introduce error for testing

* fix: correct drop_relation macro with proper syntax and cascade handling

* refactor: update duckdb drop_relation macro to handle path object safely

* refactor: simplify ducklake relation detection in drop_relation macro

* refactor: update drop_relation macro to check ducklake path directly

* fix: modify ducklake path detection in drop_relation macro

* The changes look good. I'll summarize the key modifications:

1. For both `duckdb__drop_relation` and `duckdb__drop_schema` macros:
   - Added a namespace approach to track whether the relation/schema is associated with a ducklake attachment
   - Check if the relation's database matches an attachment's alias
   - Check if the attachment path contains 'ducklake:'
   - If a ducklake attachment is found, drop without cascade
   - Otherwise, drop with cascade

2. Removed previous hardcoded checks for 'ducklake:' in the database name
3. Used a more robust method to detect ducklake-related relations/schemas

The implementation ensures that:
- Ducklake-attached databases are handled differently
- The check is performed dynamically based on the profile configuration
- The existing cascade behavior is preserved for non-ducklake databases

Commit message: refactor: improve ducklake detection in drop relation and schema macros

* refactor: remove log statements from drop macros

* refactor: extract ducklake detection into reusable helper macro

- Create is_ducklake_database() helper macro to centralize ducklake detection logic
- Eliminate code duplication between drop_schema and drop_relation macros
- Improve maintainability by having single source of truth for ducklake checks
- Use early return optimization in helper macro for better performance

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* refactor: move ducklake detection to adapter class with @available decorator

- Add is_ducklake() method to DuckDBAdapter class as @available method
- Remove Jinja macro approach in favor of proper Python implementation
- Pass full relation object to adapter method for better type safety
- Improve testability by moving logic to adapter class
- Follow dbt-duckdb conventions for adapter functionality

Addresses maintainer feedback to use @available adapter methods for complex logic.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* test: add comprehensive unit tests for is_ducklake adapter method

- Test all scenarios: no attach config, empty config, ducklake paths, regular paths
- Test edge cases: None relation, missing database, missing alias, empty path
- Test mixed attachment configurations with multiple database types
- All 9 test cases pass, ensuring robust ducklake detection logic

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* fix: update tests to use correct ducklake path format

- Change test paths from "ducklake:s3://..." to "ducklake:sqlite:storage/metadata.sqlite"
- Use proper ducklake format as shown in real profiles.yml examples
- All 9 tests still pass with corrected path formats
- Ensures tests match actual ducklake usage patterns

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* Revert drop schema

* perf: optimize is_ducklake with efficient set lookup

- Add _ducklake_dbs set to DuckDBCredentials.__post_init__ for O(1) lookup
- Simplify is_ducklake method to use set membership instead of O(n) iteration
- Maintains same functionality while minimizing overhead for non-ducklake projects

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
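A minimal sketch of the final approach (set built once in `__post_init__`, O(1) membership check in `is_ducklake`). Class and field names mirror the description above but this is not the exact merged code, and the real adapter method is additionally exposed to the Jinja macros via the `@available` decorator:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set

@dataclass
class DuckDBCredentials:
    attach: Optional[List[Dict[str, Any]]] = None
    _ducklake_dbs: Set[str] = field(default_factory=set, init=False)

    def __post_init__(self) -> None:
        # Remember the alias of every attachment whose path points at a ducklake catalog.
        for entry in self.attach or []:
            alias = entry.get("alias")
            if alias and "ducklake:" in entry.get("path", ""):
                self._ducklake_dbs.add(alias)

class DuckDBAdapter:
    def __init__(self, credentials: DuckDBCredentials) -> None:
        self.credentials = credentials

    def is_ducklake(self, relation) -> bool:
        # Used by drop_relation to skip CASCADE for ducklake-attached databases.
        return (
            relation is not None
            and getattr(relation, "database", None) in self.credentials._ducklake_dbs
        )
```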
hrl20 and others added 25 commits June 24, 2025 16:01
I believe this fixes the root cause of duckdb#513 

## Problem
Currently, when using MotherDuck with dbt-duckdb, a real temp schema and real tables are created to store the staging tables, because temp tables are not yet supported. By default this schema is called `dbt_temp`. When running dbt with 2 or more threads, each incremental model opens a transaction that starts with the statement `CREATE SCHEMA IF NOT EXISTS dbt_temp`. The model that executes first causes a catalog change with the `CREATE SCHEMA` statement while its transaction is still open, and the remaining models fail with a write-write conflict on that statement.

## Solution & Considerations
If the schema `dbt_temp` already exists, the `CREATE SCHEMA` statement is a no-op and all incremental models succeed.
Ideally, the create schema statement would be run just once at the beginning of a dbt run. However, the current implementation allows each model to configure its own temp schema, which means we don't fully know the name of the schema that needs to be created until processing the model, and the `CREATE SCHEMA IF NOT EXISTS {temp_schema_name}` statement in `macros/materializations/incremental.sql` needs to be preserved.
So the fix only addresses the problem of creating the default temp schema `dbt_temp`, by running `CREATE SCHEMA IF NOT EXISTS dbt_temp` when initializing the connection.
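A minimal sketch of that fix, assuming a plain DuckDB connection for illustration (the real change happens when dbt-duckdb initializes its connection to MotherDuck):

```python
import duckdb

DEFAULT_TEMP_SCHEMA = "dbt_temp"  # default temp schema name described above

def initialize_connection(path: str = ":memory:") -> duckdb.DuckDBPyConnection:
    con = duckdb.connect(path)
    # Run the idempotent DDL once, outside of any model transaction, so the later
    # CREATE SCHEMA IF NOT EXISTS statements inside concurrent incremental model
    # transactions are no-ops and no longer race on a catalog change.
    con.execute(f"CREATE SCHEMA IF NOT EXISTS {DEFAULT_TEMP_SCHEMA}")
    return con
```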
---
updated-dependencies:
- dependency-name: sigstore/gh-action-sigstore-python
  dependency-version: 3.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mypy](https://github.com/python/mypy) from 1.16.0 to 1.16.1.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](python/mypy@v1.16.0...v1.16.1)

---
updated-dependencies:
- dependency-name: mypy
  dependency-version: 1.16.1
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/spulec/freezegun/releases)
- [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG)
- [Commits](spulec/freezegun@1.5.2...1.5.3)

---
updated-dependencies:
- dependency-name: freezegun
  dependency-version: 1.5.3
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
updated-dependencies:
- dependency-name: pyarrow
  dependency-version: 21.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add struct column support with BigQuery-style flatten functionality

- Add fields property to DuckDBColumn to support nested struct columns
- Implement create_from_dict and create factory methods for column creation
- Add is_struct method to detect struct data types
- Add flatten method to recursively flatten nested struct columns into dot-notation columns
- Update tests to use new factory methods and add comprehensive struct tests
- Fix return type annotation in get_column_schema_from_query method

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* Simplify struct column implementation

- Remove unnecessary factory methods and auto-conversion logic
- Use direct constructor calls instead of create() methods
- Eliminate __post_init__ and create_from_dict complexity
- Simplify flatten() method to use basic constructor
- Update tests to use simplified API

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* Fix lint error: avoid shadowing 'field' import with loop variable

* feat: Add struct flattening to get_column_schema_from_query

This commit introduces automatic flattening of STRUCT data types within the `get_column_schema_from_query` method.

Previously, when a query returned a STRUCT, it was treated as a single, opaque column. This change recursively unnests STRUCTs, creating a separate column for each field, with names reflecting their nested path (e.g., `struct_column.field_name`).

Changes include:
- A `__post_init__` method was added to the `DuckDBColumn` class to parse STRUCT type definitions and build a nested representation of their fields.
- The `flatten` method in `DuckDBColumn` was updated to recursively traverse the nested structure and produce a flat list of columns.
- The `get_column_schema_from_query` method in `DuckDBAdapter` was updated to use the new `DuckDBColumn` functionality.
- Tests for `DuckDBColumn` were updated to reflect the new parsing logic and to be more concise.

* undo those changes

* rm print in test

---------

Co-authored-by: Claude <[email protected]>
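Since parts of this thread were later reverted ("undo those changes"), the sketch below only illustrates the flattening idea itself: expanding a DuckDB STRUCT column into dot-notation leaf columns, with simplified parsing of well-formed `STRUCT(name TYPE, ...)` strings. It is not the merged `DuckDBColumn` code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Column:
    name: str
    dtype: str

    def is_struct(self) -> bool:
        return self.dtype.upper().startswith("STRUCT(")

    def _fields(self) -> List["Column"]:
        # Split "STRUCT(a INTEGER, b STRUCT(c VARCHAR))" into its top-level fields,
        # tracking parenthesis depth so nested structs are kept intact.
        inner = self.dtype[self.dtype.index("(") + 1 : self.dtype.rindex(")")]
        fields, depth, start = [], 0, 0
        for i, ch in enumerate(inner + ","):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "," and depth == 0:
                name, dtype = inner[start:i].strip().split(" ", 1)
                fields.append(Column(name, dtype.strip()))
                start = i + 1
        return fields

    def flatten(self) -> List["Column"]:
        # Non-struct columns pass through; struct columns expand recursively
        # into dot-notation leaves, e.g. struct_column.field_name.
        if not self.is_struct():
            return [self]
        out: List["Column"] = []
        for child in self._fields():
            out.extend(Column(f"{self.name}.{c.name}", c.dtype) for c in child.flatten())
        return out

# Prints ['s.a', 's.b.c']
print([c.name for c in Column("s", "STRUCT(a INTEGER, b STRUCT(c VARCHAR))").flatten()])
```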
---
updated-dependencies:
- dependency-name: mypy
  dependency-version: 1.17.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [dbt-tests-adapter](https://github.com/dbt-labs/dbt-adapters) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/dbt-labs/dbt-adapters/releases)
- [Commits](https://github.com/dbt-labs/dbt-adapters/commits)

---
updated-dependencies:
- dependency-name: dbt-tests-adapter
  dependency-version: 1.17.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [mypy](https://github.com/python/mypy) from 1.17.0 to 1.17.1.
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](python/mypy@v1.17.0...v1.17.1)

---
updated-dependencies:
- dependency-name: mypy
  dependency-version: 1.17.1
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v4...v5)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [freezegun](https://github.com/spulec/freezegun) from 1.5.3 to 1.5.5.
- [Release notes](https://github.com/spulec/freezegun/releases)
- [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG)
- [Commits](spulec/freezegun@1.5.3...1.5.5)

---
updated-dependencies:
- dependency-name: freezegun
  dependency-version: 1.5.5
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix secrets map/list type handling in DuckDB SQL generation

Add support for proper DuckDB map and array syntax when converting
dict and list types in secret parameters to SQL. Previously, these
complex types were incorrectly wrapped in quotes, causing syntax
errors like "Parser Error: syntax error at or near 'TYPE'".

Changes:
- Add _format_value() method to handle dict, list, and scalar types
- Dict values now generate: map {'key': 'value', 'key2': 'value2'}
- List values now generate: array ['item1', 'item2', 'item3']
- Scalar values maintain existing behavior with proper quoting

This fixes issues with ducklake secrets that use metadata_parameters
as a map type, enabling proper conversion from YAML to DuckDB SQL.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* Remove redundant test assertions for map/list secrets

Removed trivial property access assertions in favor of focusing on the
meaningful SQL generation verification.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
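A hedged sketch of the `_format_value()` idea, not the merged code: dicts render with DuckDB's `MAP {...}` literal syntax, lists render as array literals, and scalars keep simple quoting.

```python
from typing import Any

def _format_value(value: Any) -> str:
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, dict):
        entries = ", ".join(f"'{k}': '{v}'" for k, v in value.items())
        return f"map {{{entries}}}"
    if isinstance(value, (list, tuple)):
        items = ", ".join(f"'{item}'" for item in value)
        return f"array [{items}]"
    return f"'{value}'"

# e.g. a ducklake secret's metadata_parameters map and a list-valued parameter:
print(_format_value({"key": "value", "key2": "value2"}))  # map {'key': 'value', 'key2': 'value2'}
print(_format_value(["item1", "item2", "item3"]))         # array ['item1', 'item2', 'item3']
```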
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…bility (duckdb#612)

Replace logger.exception with logger.warning + traceback formatting to preserve
stack trace information while using WARNING level instead of ERROR level.
This prevents orchestrators like Dagster from treating the log entry as a fatal
exception that requires process termination.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <[email protected]>
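A minimal sketch of the change, assuming a standard `logging` setup; the logger name, wrapped call, and function name are illustrative:

```python
import logging
import traceback

logger = logging.getLogger("dbt.adapters.duckdb")  # illustrative logger name

def close_plugin(plugin) -> None:  # illustrative wrapper around a cleanup call
    try:
        plugin.close()
    except Exception:
        # Keep the stack trace, but at WARNING level instead of ERROR, so
        # orchestrators such as Dagster don't treat it as a fatal exception.
        logger.warning("Error closing plugin:\n%s", traceback.format_exc())
```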
… will update the Glue schema to the schema of the latest partition (which in turn allows for append-only schema evolution in later partitions, this is supported by Glue)
…sult in multiple schema versions (instead of just adding a partition)
```
>       cursor.execute("CREATE TABLE tt1 (id int, name text)")
E       sqlite3.OperationalError: table tt1 already exists
```
firewall413 (Author) commented:

Would it be possible to have a look at the MotherDuck test? It seems to be blocking this PR.

hrl20 (Collaborator) commented Sep 2, 2025

> Would it be possible to have a look at the MotherDuck test? It seems to be blocking this PR.

I'll take a look after this run completes.
