Ibis Integration
Metaxy uses Ibis as a portable dataframe abstraction for SQL-based metadata stores. The IbisMetadataStore is the base class for all SQL-backed stores.
metaxy.metadata_store.ibis
Ibis-based metadata store for SQL databases.
Supports any SQL database that Ibis supports:
- DuckDB, PostgreSQL, MySQL (local/embedded)
- ClickHouse, Snowflake, BigQuery (cloud analytical)
- And 20+ other backends
metaxy.metadata_store.ibis.IbisMetadataStore
IbisMetadataStore(
versioning_engine: VersioningEngineOptions = "auto",
connection_string: str | None = None,
*,
backend: str | None = None,
connection_params: dict[str, Any] | None = None,
table_prefix: str | None = None,
**kwargs: Any,
)
Bases: MetadataStore, ABC
Generic SQL metadata store using Ibis.
Supports any Ibis backend that supports struct types, such as: DuckDB, PostgreSQL, ClickHouse, and others.
Warning
Backends without native struct support (e.g., SQLite) are NOT supported.
Storage layout:
- Each feature gets its own table: {feature}__{key}
- System tables: metaxy__system__feature_versions, metaxy__system__migrations
- Uses Ibis for cross-database compatibility
Note: Uses MD5 hash by default for cross-database compatibility. DuckDBMetadataStore overrides this with dynamic algorithm detection. For other backends, override the calculator instance variable with backend-specific implementations.
Example
# ClickHouse
store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")
# PostgreSQL
store = IbisMetadataStore("postgresql://user:pass@host:5432/db")
# DuckDB (use DuckDBMetadataStore instead for better hash support)
store = IbisMetadataStore("duckdb:///metadata.db")
with store:
    store.write(MyFeature, df)
Parameters:
- versioning_engine (VersioningEngineOptions, default: 'auto'): Which versioning engine to use.
    - "auto": Prefer the store's native engine, fall back to Polars if needed
    - "native": Always use the store's native engine, raise VersioningEngineMismatchError if provided dataframes are incompatible
    - "polars": Always use the Polars engine
- connection_string (str | None, default: None): Ibis connection string (e.g., "clickhouse://host:9000/db"). If provided, backend and connection_params are ignored.
- backend (str | None, default: None): Ibis backend name (e.g., "clickhouse", "postgres", "duckdb"). Used with connection_params for more control.
- connection_params (dict[str, Any] | None, default: None): Backend-specific connection parameters, e.g., {"host": "localhost", "port": 9000, "database": "default"}
- table_prefix (str | None, default: None): Optional prefix applied to all feature and system table names. Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL identifier when combined with the generated table name.
- **kwargs (Any, default: {}): Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm)
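The table_prefix constraint can be checked before any table is created. The following is a minimal sketch and not Metaxy's actual implementation: the helper name prefixed_table_name and the identifier pattern are assumptions made for illustration, and real backends differ in which unquoted identifiers they accept.

```python
from __future__ import annotations

import re

# Assumed, conservative pattern for an unquoted SQL identifier. Actual rules
# vary by backend; this pattern is hypothetical and not taken from Metaxy.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def prefixed_table_name(table_name: str, table_prefix: str | None = None) -> str:
    """Prepend the optional prefix and check the result is a plain identifier."""
    full_name = f"{table_prefix or ''}{table_name}"
    if not _IDENTIFIER.fullmatch(full_name):
        raise ValueError(f"{full_name!r} is not a valid SQL identifier")
    return full_name


# The system table name comes from the storage layout described above.
print(prefixed_table_name("metaxy__system__feature_versions", "prod_"))
```

Under these assumptions a prefix such as "prod_" passes, while one containing a hyphen (e.g., "prod-") is rejected with a ValueError.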
Raises:
- ValueError: If neither connection_string nor backend is provided
- ImportError: If Ibis or required backend driver not installed
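The precedence between connection_string and backend, together with the ValueError case, can be sketched as a small standalone function. This is an illustration of the documented rules only: resolve_connection and its return shape are hypothetical, and the point at which Metaxy itself performs this check is not shown in this page.

```python
from __future__ import annotations

from typing import Any


def resolve_connection(
    connection_string: str | None = None,
    backend: str | None = None,
    connection_params: dict[str, Any] | None = None,
) -> tuple[str, Any]:
    """Return ("url", connection_string) or ("backend", (backend, params))."""
    if connection_string is not None:
        # A connection string wins; backend and connection_params are ignored.
        return ("url", connection_string)
    if backend is not None:
        # Copy the params so the caller's dict is never mutated.
        return ("backend", (backend, dict(connection_params or {})))
    raise ValueError("Provide either connection_string or backend")


print(resolve_connection(backend="clickhouse", connection_params={"host": "localhost", "port": 9000}))
```

Passing both a connection string and a backend returns the connection string branch, which mirrors the "backend and connection_params are ignored" rule in the parameter list.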
Source code in src/metaxy/metadata_store/ibis.py
def __init__(
    self,
    versioning_engine: VersioningEngineOptions = "auto",
    connection_string: str | None = None,
    *,
    backend: str | None = None,
    connection_params: dict[str, Any] | None = None,
    table_prefix: str | None = None,
    **kwargs: Any,
):
    """
    Initialize Ibis metadata store.

    Args:
        versioning_engine: Which versioning engine to use.
            - "auto": Prefer the store's native engine, fall back to Polars if needed
            - "native": Always use the store's native engine, raise `VersioningEngineMismatchError`
              if provided dataframes are incompatible
            - "polars": Always use the Polars engine
        connection_string: Ibis connection string (e.g., "clickhouse://host:9000/db")
            If provided, backend and connection_params are ignored.
        backend: Ibis backend name (e.g., "clickhouse", "postgres", "duckdb")
            Used with connection_params for more control.
        connection_params: Backend-specific connection parameters
            e.g., {"host": "localhost", "port": 9000, "database": "default"}
        table_prefix: Optional prefix applied to all feature and system table names.
            Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL
            identifier when combined with the generated table name.
        **kwargs: Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm)

    Raises:
        ValueError: If neither connection_string nor backend is provided
        ImportError: If Ibis or required backend driver not installed

    Example:
        <!-- skip next -->
        ```py
        # Using connection string
        store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

        # Using backend + params
        store = IbisMetadataStore(backend="clickhouse", connection_params={"host": "localhost", "port": 9000})
        ```
    """
    from ibis.backends.sql import SQLBackend

    self.connection_string = connection_string
    self.backend = backend
    self.connection_params = connection_params or {}
    self._conn: SQLBackend | None = None
    self._table_prefix = table_prefix or ""

    super().__init__(
        **kwargs,
        versioning_engine=versioning_engine,
    )