Metaxy + DuckLake
Experimental
This functionality is experimental.
DuckLake is a modern lakehouse format that uses a relational database as its metadata catalog.
Currently, the only production-ready implementation of DuckLake is the one in DuckDB. The built-in DuckDBMetadataStore can be configured to use DuckLake as its storage backend. Learn more about the DuckDB integration here.
Configuration
A DuckLake configuration has two main parts: a catalog (where the transaction log and other metadata are stored) and a storage backend (where the data files, which are Parquet files, live).
Each piece of configuration that manages secrets (e.g. PostgreSQL, S3, R2, GCS) requires a secret_name parameter. Metaxy uses this name to either create a new DuckDB secret (when inline credentials are provided) or reference a pre-existing one (when only the name is given).
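As a sketch of the second case, the fragment below gives only the secret name. It assumes a DuckDB secret called my_pg_secret already exists in the connection and carries the connection details. No inline credentials are present, so by the rule above Metaxy would reference that secret rather than create a new one.

```toml
# Assumption: a DuckDB secret named "my_pg_secret" was created beforehand
# and already holds the host, database, user and password.
[stores.dev.config.ducklake.catalog]
type = "postgres"
secret_name = "my_pg_secret"
```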
Tip
To use the credential chain (IAM roles, environment variables, etc.) instead of static S3 credentials, set secret_parameters = { provider = "credential_chain" }.
Learn more in DuckDB docs.
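A minimal sketch of the tip above, assuming credentials are discoverable through the credential chain. The bucket name and region are placeholders, and no key_id or secret is set.

```toml
[stores.dev.config.ducklake.storage]
type = "s3"
secret_name = "my_s3_secret"
bucket = "my-ducklake-bucket"   # placeholder bucket name
region = "eu-central-1"         # placeholder region
# Resolve credentials from IAM roles, environment variables, etc.
secret_parameters = { provider = "credential_chain" }
```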
Example Configuration
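# Metadata store backed by DuckDB with DuckLake attached
[stores.dev]
type = "metaxy.ext.duckdb.DuckDBMetadataStore"
# Catalog: PostgreSQL holds the transaction log and other metadata
[stores.dev.config.ducklake.catalog]
type = "postgres"
secret_name = "my_pg_secret"  # inline credentials follow, so a DuckDB secret with this name is created
host = "localhost"
port = 5432
database = "ducklake_meta"
user = "ducklake"
password = "changeme"
# Storage: S3 holds the Parquet data files
[stores.dev.config.ducklake.storage]
type = "s3"
secret_name = "my_s3_secret"
bucket = "my-ducklake-bucket"
key_id = "AKIA..."
secret = "..."
region = "eu-central-1"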
See the DuckLake example to learn more.
DuckLake attachment configuration for a DuckDB connection.
Show JSON schema:
{
"$defs": {
"DuckDBCatalogConfig": {
"description": "DuckDB file-based metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "duckdb",
"default": "duckdb",
"title": "Type",
"type": "string"
},
"uri": {
"title": "Uri",
"type": "string"
}
},
"required": [
"uri"
],
"title": "DuckDBCatalogConfig",
"type": "object"
},
"GCSStorageConfig": {
"description": "Google Cloud Storage backend for [DuckLake](https://ducklake.select/).\n\nUses the DuckDB [`TYPE GCS`](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api#gcs-secrets) secret.",
"properties": {
"type": {
"const": "gcs",
"default": "gcs",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"data_path": {
"title": "Data Path",
"type": "string"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"data_path",
"secret_name"
],
"title": "GCSStorageConfig",
"type": "object"
},
"LocalStorageConfig": {
"description": "Local filesystem storage backend for DuckLake.",
"properties": {
"type": {
"const": "local",
"default": "local",
"title": "Type",
"type": "string"
},
"path": {
"title": "Path",
"type": "string"
}
},
"required": [
"path"
],
"title": "LocalStorageConfig",
"type": "object"
},
"MotherDuckCatalogConfig": {
"description": "[MotherDuck](https://motherduck.com/)-managed metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "motherduck",
"default": "motherduck",
"title": "Type",
"type": "string"
},
"database": {
"title": "Database",
"type": "string"
},
"region": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "AWS region of the MotherDuck-managed S3 storage (e.g. 'eu-central-1').",
"title": "Region"
}
},
"required": [
"database"
],
"title": "MotherDuckCatalogConfig",
"type": "object"
},
"PostgresCatalogConfig": {
"description": "PostgreSQL metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "postgres",
"default": "postgres",
"title": "Type",
"type": "string"
},
"database": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Database"
},
"user": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "User"
},
"password": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Password"
},
"host": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Host"
},
"port": {
"default": 5432,
"title": "Port",
"type": "integer"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"secret_name"
],
"title": "PostgresCatalogConfig",
"type": "object"
},
"R2StorageConfig": {
"description": "Cloudflare R2 storage backend for [DuckLake](https://ducklake.select/).\n\nUses the DuckDB [`TYPE R2`](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api#r2-secrets) secret.",
"properties": {
"type": {
"const": "r2",
"default": "r2",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"account_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Account Id"
},
"data_path": {
"title": "Data Path",
"type": "string"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"data_path",
"secret_name"
],
"title": "R2StorageConfig",
"type": "object"
},
"S3StorageConfig": {
"description": "[S3 storage](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api) backend for DuckLake.",
"properties": {
"type": {
"const": "s3",
"default": "s3",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"endpoint": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Endpoint"
},
"bucket": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Bucket"
},
"prefix": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Prefix"
},
"region": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Region"
},
"url_style": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Url Style"
},
"use_ssl": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": null,
"title": "Use Ssl"
},
"scope": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Scope"
},
"data_path": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Data Path"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"secret_name"
],
"title": "S3StorageConfig",
"type": "object"
},
"SQLiteCatalogConfig": {
"description": "SQLite file-based metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "sqlite",
"default": "sqlite",
"title": "Type",
"type": "string"
},
"uri": {
"title": "Uri",
"type": "string"
}
},
"required": [
"uri"
],
"title": "SQLiteCatalogConfig",
"type": "object"
}
},
"description": "[DuckLake](https://ducklake.select/) attachment configuration for a DuckDB connection.",
"properties": {
"catalog": {
"description": "Metadata catalog backend (DuckDB, SQLite, PostgreSQL, or MotherDuck).",
"discriminator": {
"mapping": {
"duckdb": "#/$defs/DuckDBCatalogConfig",
"motherduck": "#/$defs/MotherDuckCatalogConfig",
"postgres": "#/$defs/PostgresCatalogConfig",
"sqlite": "#/$defs/SQLiteCatalogConfig"
},
"propertyName": "type"
},
"oneOf": [
{
"$ref": "#/$defs/DuckDBCatalogConfig"
},
{
"$ref": "#/$defs/SQLiteCatalogConfig"
},
{
"$ref": "#/$defs/PostgresCatalogConfig"
},
{
"$ref": "#/$defs/MotherDuckCatalogConfig"
}
],
"title": "Catalog"
},
"storage": {
"anyOf": [
{
"discriminator": {
"mapping": {
"gcs": "#/$defs/GCSStorageConfig",
"local": "#/$defs/LocalStorageConfig",
"r2": "#/$defs/R2StorageConfig",
"s3": "#/$defs/S3StorageConfig"
},
"propertyName": "type"
},
"oneOf": [
{
"$ref": "#/$defs/LocalStorageConfig"
},
{
"$ref": "#/$defs/S3StorageConfig"
},
{
"$ref": "#/$defs/R2StorageConfig"
},
{
"$ref": "#/$defs/GCSStorageConfig"
}
]
},
{
"type": "null"
}
],
"default": null,
"description": "Data storage backend (local filesystem, S3, R2, or GCS). Not required for MotherDuck.",
"title": "Storage"
},
"alias": {
"default": "ducklake",
"description": "DuckDB catalog alias for the attached DuckLake database.",
"title": "Alias",
"type": "string"
},
"attach_options": {
"additionalProperties": true,
"description": "Extra [DuckLake](https://ducklake.select/) ATTACH options (e.g., api_version, override_data_path).",
"title": "Attach Options",
"type": "object"
},
"data_inlining_row_limit": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null,
"description": "Store inserts smaller than this row count directly in the metadata catalog instead of creating Parquet files.",
"title": "Data Inlining Row Limit"
}
},
"required": [
"catalog"
],
"title": "DuckLakeConfig",
"type": "object"
}
catalog
pydantic-field
catalog: (
DuckDBCatalogConfig
| SQLiteCatalogConfig
| PostgresCatalogConfig
| MotherDuckCatalogConfig
)
Metadata catalog backend (DuckDB, SQLite, PostgreSQL, or MotherDuck).
storage
pydantic-field
storage: (
LocalStorageConfig
| S3StorageConfig
| R2StorageConfig
| GCSStorageConfig
| None
) = None
Data storage backend (local filesystem, S3, R2, or GCS). Not required for MotherDuck.
alias
pydantic-field
alias: str = 'ducklake'
DuckDB catalog alias for the attached DuckLake database.
attach_options
pydantic-field
Extra DuckLake ATTACH options (e.g., api_version, override_data_path).
data_inlining_row_limit
pydantic-field
data_inlining_row_limit: int | None = None
Store inserts smaller than this row count directly in the metadata catalog instead of creating Parquet files.
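The optional top-level fields above might be set as follows. This is a sketch: the [stores.dev.config.ducklake] table path is inferred from the example configuration, and the alias, row limit and catalog path are arbitrary illustrative values.

```toml
[stores.dev.config.ducklake]
alias = "lake"                 # DuckDB catalog alias (default: "ducklake")
data_inlining_row_limit = 100  # inserts below 100 rows stay in the metadata catalog

# A catalog is always required.
[stores.dev.config.ducklake.catalog]
type = "duckdb"
uri = "ducklake_catalog.duckdb"  # placeholder path
```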
Catalog Backends
DuckDB file-based metadata backend for DuckLake.
Show JSON schema:
{
"description": "DuckDB file-based metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "duckdb",
"default": "duckdb",
"title": "Type",
"type": "string"
},
"uri": {
"title": "Uri",
"type": "string"
}
},
"required": [
"uri"
],
"title": "DuckDBCatalogConfig",
"type": "object"
}
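A fully local setup could pair this catalog with local storage. The fragment is a sketch that assumes uri accepts a plain file path; both paths are placeholders.

```toml
[stores.dev.config.ducklake.catalog]
type = "duckdb"
uri = "ducklake_catalog.duckdb"  # placeholder: DuckDB file holding the catalog

[stores.dev.config.ducklake.storage]
type = "local"
path = "./ducklake_data"         # placeholder: directory for the Parquet data files
```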
SQLite file-based metadata backend for DuckLake.
Show JSON schema:
{
"description": "SQLite file-based metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "sqlite",
"default": "sqlite",
"title": "Type",
"type": "string"
},
"uri": {
"title": "Uri",
"type": "string"
}
},
"required": [
"uri"
],
"title": "SQLiteCatalogConfig",
"type": "object"
}
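The SQLite catalog takes the same two fields. A sketch, again assuming uri accepts a plain file path (the file name is a placeholder):

```toml
[stores.dev.config.ducklake.catalog]
type = "sqlite"
uri = "ducklake_catalog.sqlite"  # placeholder: SQLite file holding the catalog
```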
PostgreSQL metadata backend for DuckLake.
Show JSON schema:
{
"description": "PostgreSQL metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "postgres",
"default": "postgres",
"title": "Type",
"type": "string"
},
"database": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Database"
},
"user": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "User"
},
"password": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Password"
},
"host": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Host"
},
"port": {
"default": 5432,
"title": "Port",
"type": "integer"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"secret_name"
],
"title": "PostgresCatalogConfig",
"type": "object"
}
MotherDuck-managed metadata backend for DuckLake.
Show JSON schema:
{
"description": "[MotherDuck](https://motherduck.com/)-managed metadata backend for [DuckLake](https://ducklake.select/).",
"properties": {
"type": {
"const": "motherduck",
"default": "motherduck",
"title": "Type",
"type": "string"
},
"database": {
"title": "Database",
"type": "string"
},
"region": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"description": "AWS region of the MotherDuck-managed S3 storage (e.g. 'eu-central-1').",
"title": "Region"
}
},
"required": [
"database"
],
"title": "MotherDuckCatalogConfig",
"type": "object"
}
region
pydantic-field
region: str | None = None
AWS region of the MotherDuck-managed S3 storage (e.g. 'eu-central-1').
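A MotherDuck catalog might be configured like this. The database name is a placeholder, and the storage table is left out because storage is not required for MotherDuck.

```toml
[stores.dev.config.ducklake.catalog]
type = "motherduck"
database = "my_ducklake"  # placeholder database name
region = "eu-central-1"   # optional: region of the MotherDuck-managed S3 storage
# No [stores.dev.config.ducklake.storage] table is needed here.
```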
Storage Backends
S3 storage backend for DuckLake.
Show JSON schema:
{
"description": "[S3 storage](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api) backend for DuckLake.",
"properties": {
"type": {
"const": "s3",
"default": "s3",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"endpoint": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Endpoint"
},
"bucket": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Bucket"
},
"prefix": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Prefix"
},
"region": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Region"
},
"url_style": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Url Style"
},
"use_ssl": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": null,
"title": "Use Ssl"
},
"scope": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Scope"
},
"data_path": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Data Path"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"secret_name"
],
"title": "S3StorageConfig",
"type": "object"
}
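The endpoint, url_style and use_ssl fields suggest that S3-compatible services can be targeted as well. The sketch below assumes a local MinIO-style server and assumes these fields are passed through to the DuckDB S3 secret options of the same names. All values are placeholders.

```toml
[stores.dev.config.ducklake.storage]
type = "s3"
secret_name = "my_minio_secret"
endpoint = "localhost:9000"    # assumption: host:port of the S3-compatible service
url_style = "path"             # path-style URLs, as many self-hosted services expect
use_ssl = false                # plain HTTP for a local test server
bucket = "my-ducklake-bucket"
prefix = "warehouse"
key_id = "..."
secret = "..."
```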
Cloudflare R2 storage backend for DuckLake.
Uses the DuckDB TYPE R2 secret.
Show JSON schema:
{
"description": "Cloudflare R2 storage backend for [DuckLake](https://ducklake.select/).\n\nUses the DuckDB [`TYPE R2`](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api#r2-secrets) secret.",
"properties": {
"type": {
"const": "r2",
"default": "r2",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"account_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Account Id"
},
"data_path": {
"title": "Data Path",
"type": "string"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"data_path",
"secret_name"
],
"title": "R2StorageConfig",
"type": "object"
}
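An R2 configuration might look like the following. It is a sketch: the bucket and prefix in data_path are placeholders, and the r2:// URL form assumes DuckDB's R2 secret handling.

```toml
[stores.dev.config.ducklake.storage]
type = "r2"
secret_name = "my_r2_secret"
account_id = "..."   # Cloudflare account ID
key_id = "..."
secret = "..."
data_path = "r2://my-ducklake-bucket/warehouse/"  # required for R2
```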
Google Cloud Storage backend for DuckLake.
Uses the DuckDB TYPE GCS secret.
Show JSON schema:
{
"description": "Google Cloud Storage backend for [DuckLake](https://ducklake.select/).\n\nUses the DuckDB [`TYPE GCS`](https://duckdb.org/docs/stable/core_extensions/httpfs/s3api#gcs-secrets) secret.",
"properties": {
"type": {
"const": "gcs",
"default": "gcs",
"title": "Type",
"type": "string"
},
"key_id": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Key Id"
},
"secret": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret"
},
"data_path": {
"title": "Data Path",
"type": "string"
},
"secret_name": {
"title": "Secret Name",
"type": "string"
},
"secret_parameters": {
"anyOf": [
{
"additionalProperties": true,
"type": "object"
},
{
"type": "null"
}
],
"default": null,
"title": "Secret Parameters"
}
},
"required": [
"data_path",
"secret_name"
],
"title": "GCSStorageConfig",
"type": "object"
}
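A GCS configuration might look like the following. It is a sketch: DuckDB's GCS secrets use HMAC keys, and the bucket and prefix in data_path are placeholders.

```toml
[stores.dev.config.ducklake.storage]
type = "gcs"
secret_name = "my_gcs_secret"
key_id = "..."   # HMAC access key
secret = "..."   # HMAC secret
data_path = "gs://my-ducklake-bucket/warehouse/"  # required for GCS
```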