
Ibis Integration

Metaxy uses Ibis as a portable dataframe abstraction for SQL-based metadata stores. The IbisMetadataStore is the base class for all SQL-backed stores.

metaxy.metadata_store.ibis

Ibis-based metadata store for SQL databases.

Supports any SQL database that Ibis supports:

  • DuckDB, PostgreSQL, MySQL (local/embedded)
  • ClickHouse, Snowflake, BigQuery (cloud analytical)
  • And 20+ other backends

metaxy.metadata_store.ibis.IbisMetadataStore

IbisMetadataStore(
    versioning_engine: VersioningEngineOptions = "auto",
    connection_string: str | None = None,
    *,
    backend: str | None = None,
    connection_params: dict[str, Any] | None = None,
    table_prefix: str | None = None,
    **kwargs: Any,
)

Bases: MetadataStore, ABC

Generic SQL metadata store using Ibis.

Supports any Ibis backend that supports struct types, such as: DuckDB, PostgreSQL, ClickHouse, and others.

Warning

Backends without native struct support (e.g., SQLite) are NOT supported.

Storage layout:

  • Each feature gets its own table: {feature}__{key}
  • System tables: metaxy__system__feature_versions, metaxy__system__migrations
  • Uses Ibis for cross-database compatibility
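Because table_prefix is prepended to every feature and system table name, the combined name has to remain a valid SQL identifier. The following is a minimal sketch of that rule, not Metaxy's implementation: the helper name `prefixed_table_name` and the conservative identifier pattern are assumptions made for illustration, and real backends differ in which names they accept.

```python
import re
from typing import Optional

# Conservative identifier pattern (an assumption; real backends differ in
# what they accept and how they quote names).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def prefixed_table_name(table_prefix: Optional[str], table_name: str) -> str:
    """Prepend the optional prefix and check the result is a plain identifier."""
    full_name = f"{table_prefix or ''}{table_name}"
    if not _IDENTIFIER.fullmatch(full_name):
        raise ValueError(f"{full_name!r} is not a valid SQL identifier")
    return full_name


# System table names are taken from the storage layout above.
print(prefixed_table_name("prod_", "metaxy__system__migrations"))
print(prefixed_table_name(None, "metaxy__system__feature_versions"))
```

A prefix such as "prod-" would be rejected by this check, because the hyphen cannot appear in an unquoted identifier.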

Note: The store uses MD5 hashing by default for cross-database compatibility. DuckDBMetadataStore overrides this with dynamic algorithm detection. For other backends, override the calculator instance variable with a backend-specific implementation.
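MD5 is a portable default because nearly every SQL engine ships an MD5 function, and Python's standard library provides one too. The sketch below computes such a digest client-side. The helper `md5_hex` and the "|" separator are hypothetical and are not Metaxy's actual calculator.

```python
import hashlib


def md5_hex(*parts: str, sep: str = "|") -> str:
    """Join the parts with a separator and return the MD5 hex digest."""
    joined = sep.join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# A single part is hashed as-is; the digest is always 32 hex characters.
print(md5_hex("abc"))
# Order matters, so swapping parts yields a different digest.
print(md5_hex("a", "b") != md5_hex("b", "a"))
```

A digest computed this way is deterministic, which is what lets a backend-specific implementation be swapped in, provided it is applied consistently.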

Example
# ClickHouse
store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

# PostgreSQL
store = IbisMetadataStore("postgresql://user:pass@host:5432/db")

# DuckDB (use DuckDBMetadataStore instead for better hash support)
store = IbisMetadataStore("duckdb:///metadata.db")

with store:
    store.write(MyFeature, df)

Parameters:

  • versioning_engine (VersioningEngineOptions, default: 'auto' ) –

    Which versioning engine to use.

      • "auto": Prefer the store's native engine, fall back to Polars if needed
      • "native": Always use the store's native engine, raise VersioningEngineMismatchError if provided dataframes are incompatible
      • "polars": Always use the Polars engine

  • connection_string (str | None, default: None ) –

    Ibis connection string (e.g., "clickhouse://host:9000/db"). If provided, backend and connection_params are ignored.

  • backend (str | None, default: None ) –

    Ibis backend name (e.g., "clickhouse", "postgres", "duckdb"). Used with connection_params for more control.

  • connection_params (dict[str, Any] | None, default: None ) –

    Backend-specific connection parameters, e.g., {"host": "localhost", "port": 9000, "database": "default"}.

  • table_prefix (str | None, default: None ) –

    Optional prefix applied to all feature and system table names. Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL identifier when combined with the generated table name.

  • **kwargs (Any, default: {} ) –

    Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm).
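The three versioning_engine options amount to a small decision rule. The sketch below is one plausible reading of it, not the library's code: `choose_engine`, the `native_compatible` flag, and the `EngineMismatchError` stand-in class are all hypothetical, and "fall back to Polars if needed" is assumed to mean "when the provided dataframes are incompatible with the native engine".

```python
class EngineMismatchError(Exception):
    """Stand-in for Metaxy's VersioningEngineMismatchError (hypothetical here)."""


def choose_engine(option: str, native_compatible: bool) -> str:
    """Return "native" or "polars" for a given option and input compatibility."""
    if option == "polars":
        # Always use Polars, regardless of the input.
        return "polars"
    if option == "native":
        # Never fall back: incompatible input is an error.
        if not native_compatible:
            raise EngineMismatchError("dataframes are incompatible with the native engine")
        return "native"
    if option == "auto":
        # Prefer native, fall back to Polars when the input requires it.
        return "native" if native_compatible else "polars"
    raise ValueError(f"unknown versioning_engine: {option!r}")


print(choose_engine("auto", native_compatible=True))
print(choose_engine("auto", native_compatible=False))
```

The practical difference between "auto" and "native" is only visible on incompatible input: the former degrades silently to Polars, the latter fails loudly.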

Raises:

  • ValueError –

    If neither connection_string nor backend is provided

  • ImportError –

    If Ibis or the required backend driver is not installed
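The precedence between connection_string and backend, together with the ValueError case, can be sketched as a small resolver. This is an illustration under assumptions: `resolve_connection` and the shape of its return value are hypothetical, and this page does not show where the store itself performs the check.

```python
from typing import Any, Dict, Optional


def resolve_connection(
    connection_string: Optional[str] = None,
    *,
    backend: Optional[str] = None,
    connection_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Decide how to connect, mirroring the documented precedence."""
    if connection_string is not None:
        # A connection string wins; backend and connection_params are ignored.
        return {"mode": "url", "url": connection_string}
    if backend is not None:
        return {
            "mode": "backend",
            "backend": backend,
            "params": dict(connection_params or {}),
        }
    raise ValueError("Provide either connection_string or backend")


print(resolve_connection("clickhouse://user:pass@host:9000/db"))
print(resolve_connection(backend="clickhouse", connection_params={"host": "localhost", "port": 9000}))
```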

Example
# Using connection string
store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

# Using backend + params
store = IbisMetadataStore(backend="clickhouse", connection_params={"host": "localhost", "port": 9000})
Source code in src/metaxy/metadata_store/ibis.py
def __init__(
    self,
    versioning_engine: VersioningEngineOptions = "auto",
    connection_string: str | None = None,
    *,
    backend: str | None = None,
    connection_params: dict[str, Any] | None = None,
    table_prefix: str | None = None,
    **kwargs: Any,
):
    """
    Initialize Ibis metadata store.

    Args:
        versioning_engine: Which versioning engine to use.
            - "auto": Prefer the store's native engine, fall back to Polars if needed
            - "native": Always use the store's native engine, raise `VersioningEngineMismatchError`
                if provided dataframes are incompatible
            - "polars": Always use the Polars engine
        connection_string: Ibis connection string (e.g., "clickhouse://host:9000/db")
            If provided, backend and connection_params are ignored.
        backend: Ibis backend name (e.g., "clickhouse", "postgres", "duckdb")
            Used with connection_params for more control.
        connection_params: Backend-specific connection parameters
            e.g., {"host": "localhost", "port": 9000, "database": "default"}
        table_prefix: Optional prefix applied to all feature and system table names.
            Useful for logically separating environments (e.g., "prod_"). Must form a valid SQL
            identifier when combined with the generated table name.
        **kwargs: Passed to MetadataStore.__init__ (e.g., fallback_stores, hash_algorithm)

    Raises:
        ValueError: If neither connection_string nor backend is provided
        ImportError: If Ibis or required backend driver not installed

    Example:
        <!-- skip next -->
        ```py
        # Using connection string
        store = IbisMetadataStore("clickhouse://user:pass@host:9000/db")

        # Using backend + params
        store = IbisMetadataStore(backend="clickhouse", connection_params={"host": "localhost", "port": 9000})
        ```
    """
    from ibis.backends.sql import SQLBackend

    self.connection_string = connection_string
    self.backend = backend
    self.connection_params = connection_params or {}
    self._conn: SQLBackend | None = None
    self._table_prefix = table_prefix or ""

    super().__init__(
        **kwargs,
        versioning_engine=versioning_engine,
    )