Metaxy + PostgreSQL

Experimental

This functionality is experimental.

Metaxy can store its metadata in PostgreSQL through PostgreSQLMetadataStore. This backend is more limited than the others because PostgreSQL doesn't support map-like data types and Metaxy's versioning engine can't run in the database. The local Polars versioning engine is used instead, which results in the following limitations for MetadataStore.resolve_update:

  • Increased I/O: all upstream metadata has to be fetched into memory
  • Increased memory footprint: expect high memory usage, especially with many upstream features

Metaxy's Versioning Struct Columns

PostgreSQL doesn't have native map-like or struct types, so it's recommended to store Metaxy's versioning columns as JSONB. As a convenience feature, PostgreSQLMetadataStore automatically JSON-encodes pl.Struct columns when writing metadata and parses them back to pl.Struct when reading. This behavior can be disabled with the auto_cast_struct_for_jsonb configuration parameter. The setting only affects user-defined columns; Metaxy's versioning columns are always encoded and parsed.
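The flag semantics described above can be illustrated with a small plain-Python sketch. It uses the standard json module as a stand-in for Polars' struct encoding, and the column names (system_versions, user_info) and helper functions are hypothetical, chosen only for illustration; it is not the store's actual implementation.

```python
import json
from typing import Any

# Hypothetical name for a Metaxy-managed struct column (illustration only).
SYSTEM_STRUCT_COLUMNS = {"system_versions"}


def encode_row(row: dict[str, Any], auto_cast: bool = True) -> dict[str, Any]:
    """JSON-encode struct-like (dict) values the way the text describes."""
    encoded = {}
    for name, value in row.items():
        is_struct = isinstance(value, dict)
        # System struct columns are always encoded; user ones only if auto_cast.
        if is_struct and (name in SYSTEM_STRUCT_COLUMNS or auto_cast):
            encoded[name] = json.dumps(value, sort_keys=True)
        else:
            encoded[name] = value
    return encoded


def decode_row(row: dict[str, Any], json_columns: set[str]) -> dict[str, Any]:
    """Parse the named JSON string columns back into dicts."""
    return {
        name: json.loads(value) if name in json_columns else value
        for name, value in row.items()
    }


row = {
    "sample_id": "a1",
    "system_versions": {"frames": "abc", "audio": "def"},
    "user_info": {"width": 640},
}

with_flag = encode_row(row, auto_cast=True)
without_flag = encode_row(row, auto_cast=False)
restored = decode_row(with_flag, {"system_versions", "user_info"})
```

With the flag disabled, only the system column is turned into a JSON string, and the user-defined struct is passed through unchanged.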

API Reference

metaxy.ext.postgresql

PostgreSQL metadata store extension.

metaxy.ext.postgresql.PostgreSQLMetadataStore

PostgreSQLMetadataStore(
    connection_string: str | None = None,
    *,
    connection_params: dict[str, Any] | None = None,
    fallback_stores: list[MetadataStore] | None = None,
    auto_cast_struct_for_jsonb: bool = True,
    **kwargs: Any,
)

Bases: IbisMetadataStore

Experimental

This functionality is experimental.

PostgreSQL metadata store with storage/compute separation.

Uses PostgreSQL for storage and Polars for versioning. Filters push down to SQL WHERE clauses, then data materializes to Polars.
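To make the effect of filter pushdown concrete, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for PostgreSQL. The table name, columns, and data are hypothetical. It only illustrates the general idea that a WHERE clause evaluated by the database reduces the number of rows that reach the client; it says nothing about how Metaxy builds its queries.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature_metadata (sample_id TEXT, split TEXT)")
conn.executemany(
    "INSERT INTO feature_metadata VALUES (?, ?)",
    [(f"s{i}", "test" if i % 4 == 0 else "train") for i in range(100)],
)

# Without pushdown: every row is fetched, then filtered on the client.
all_rows = conn.execute(
    "SELECT sample_id, split FROM feature_metadata"
).fetchall()
client_filtered = [r for r in all_rows if r[1] == "test"]

# With pushdown: the filter becomes a WHERE clause, so fewer rows are fetched.
pushed_down = conn.execute(
    "SELECT sample_id, split FROM feature_metadata WHERE split = ?",
    ("test",),
).fetchall()

print(len(all_rows), len(pushed_down))
conn.close()
```

Both approaches yield the same rows, but the second transfers only the matching ones before any in-memory processing starts.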

Example
from metaxy.ext.postgresql import PostgreSQLMetadataStore

store = PostgreSQLMetadataStore(connection_string="postgresql://user:pass@localhost:5432/metaxy")

with store:
    increment = store.resolve_update(MyFeature)
    store.write(MyFeature, increment.added)

Parameters:

  • connection_string (str | None, default: None ) –

    PostgreSQL connection URI (e.g., "postgresql://user:pass@host:5432/db")

  • connection_params (dict[str, Any] | None, default: None ) –

    Dict with keys: host, port, database, user, password (alternative to connection_string)

  • fallback_stores (list[MetadataStore] | None, default: None ) –

    List of fallback stores for chaining

  • auto_cast_struct_for_jsonb (bool, default: True ) –

    If True, JSON-encode all Struct columns to strings on write (Metaxy system Struct columns are always converted). The actual SQL column type (e.g., JSON, JSONB, or TEXT) is determined by the table schema, not this flag.

  • **kwargs (Any, default: {} ) –

    Additional arguments passed to IbisMetadataStore

Source code in src/metaxy/ext/postgresql/metadata_store.py
def __init__(
    self,
    connection_string: str | None = None,
    *,
    connection_params: dict[str, Any] | None = None,
    fallback_stores: list["MetadataStore"] | None = None,
    auto_cast_struct_for_jsonb: bool = True,
    **kwargs: Any,
):
    """Initialize PostgreSQL metadata store.

    Args:
        connection_string: PostgreSQL connection URI
            (e.g., "postgresql://user:pass@host:5432/db")
        connection_params: Dict with keys: host, port, database, user, password
            (alternative to connection_string)
        fallback_stores: List of fallback stores for chaining
        auto_cast_struct_for_jsonb: If True, JSON-encode all Struct columns to strings on write
            (Metaxy system Struct columns are always converted). The actual SQL column type
            (e.g., JSON, JSONB, or TEXT) is determined by the table schema, not this flag.
        **kwargs: Additional arguments passed to IbisMetadataStore
    """
    if connection_string is None and connection_params is None:
        raise ValueError("Must provide either connection_string or connection_params for PostgreSQL")

    self.auto_cast_struct_for_jsonb = auto_cast_struct_for_jsonb

    super().__init__(
        connection_string=connection_string,
        backend="postgres",
        connection_params=connection_params,
        fallback_stores=fallback_stores,
        **kwargs,
    )

Attributes

metaxy.ext.postgresql.PostgreSQLMetadataStore.materialization_id property
materialization_id: str | None

The external orchestration ID for this store instance.

If set, all metadata writes include this ID in the metaxy_materialization_id column, allowing filtering of rows written during a specific materialization run.

metaxy.ext.postgresql.PostgreSQLMetadataStore.name property
name: str | None

The configured name of this store, if any.

metaxy.ext.postgresql.PostgreSQLMetadataStore.qualified_class_name property
qualified_class_name: str

The fully qualified class name (module.classname).

metaxy.ext.postgresql.PostgreSQLMetadataStore.conn property
conn: SQLBackend

Get Ibis backend connection.

Returns:

  • SQLBackend

    Active Ibis backend connection

Raises:

metaxy.ext.postgresql.PostgreSQLMetadataStore.sqlalchemy_url property
sqlalchemy_url: str

Get SQLAlchemy-compatible connection URL for tools like Alembic.

Returns the connection string if available. If the store was initialized with backend + connection_params instead of a connection string, raises an error since constructing a proper URL is backend-specific.

Returns:

  • str

    SQLAlchemy-compatible URL string

Raises:

  • ValueError

    If connection_string is not available

Example
store = IbisMetadataStore("postgresql://user:pass@host:5432/db")
print(store.sqlalchemy_url)  # postgresql://user:pass@host:5432/db
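Because sqlalchemy_url raises when the store was created from connection_params, a caller who needs a URL may have to build one separately. The following is a minimal sketch of such a helper, assuming the documented keys host, port, database, user, and password. The function build_postgres_url is hypothetical and not part of Metaxy.

```python
from urllib.parse import quote


def build_postgres_url(params: dict) -> str:
    """Build a postgresql:// URL from connection parameters (hypothetical helper)."""
    # Percent-encode credentials and database name so characters such as
    # "@" or "/" cannot break the URL structure.
    user = quote(str(params["user"]), safe="")
    password = quote(str(params["password"]), safe="")
    database = quote(str(params["database"]), safe="")
    host = params.get("host", "localhost")
    port = params.get("port", 5432)
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


url = build_postgres_url(
    {
        "host": "db.internal",
        "port": 5432,
        "database": "metaxy",
        "user": "user",
        "password": "p@ss/word",
    }
)
print(url)
```

The resulting string could then be passed as connection_string, which keeps sqlalchemy_url usable for tools like Alembic.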

Functions

metaxy.ext.postgresql.PostgreSQLMetadataStore.config_model classmethod

Return the config model for this metadata store.

Source code in src/metaxy/ext/postgresql/metadata_store.py
@classmethod
def config_model(cls) -> type[PostgreSQLMetadataStoreConfig]:
    """Return the config model for this metadata store."""
    return PostgreSQLMetadataStoreConfig

metaxy.ext.postgresql.PostgreSQLMetadataStore.native_implementation
native_implementation() -> Implementation

Force Polars implementation for versioning operations.

Source code in src/metaxy/ext/postgresql/metadata_store.py
def native_implementation(self) -> nw.Implementation:
    """Force Polars implementation for versioning operations."""
    return nw.Implementation.POLARS

metaxy.ext.postgresql.PostgreSQLMetadataStore.transform_before_write
transform_before_write(
    df: Frame, feature_key: FeatureKey, table_name: str
) -> Frame

Convert Struct columns to JSON strings using Polars.

Uses Polars' struct.json_encode() to serialize structs to JSON strings. Stored SQL type depends on target table schema (JSON/JSONB vs TEXT).

Source code in src/metaxy/ext/postgresql/metadata_store.py
def transform_before_write(self, df: Frame, feature_key: FeatureKey, table_name: str) -> Frame:
    """Convert Struct columns to JSON strings using Polars.

    Uses Polars' struct.json_encode() to serialize structs to JSON strings.
    Stored SQL type depends on target table schema (JSON/JSONB vs TEXT).
    """
    pl_df = collect_to_polars(df)
    return nw.from_native(self._encode_struct_columns(pl_df))

metaxy.ext.postgresql.PostgreSQLMetadataStore.transform_after_read
transform_after_read(
    table: Table, feature_key: FeatureKey
) -> Table

Cast JSONB columns to String so Polars can parse them back to Structs.

PostgreSQL JSONB columns appear as JSON type in Ibis. We cast to String so that when materialized to Polars, we can parse them back to Structs.

Source code in src/metaxy/ext/postgresql/metadata_store.py
def transform_after_read(self, table: "ibis.Table", feature_key: FeatureKey) -> "ibis.Table":
    """Cast JSONB columns to String so Polars can parse them back to Structs.

    PostgreSQL JSONB columns appear as JSON type in Ibis. We cast to String
    so that when materialized to Polars, we can parse them back to Structs.
    """
    schema = table.schema()
    json_columns = self._get_json_columns_for_struct(schema)

    if json_columns:
        mutations = {col_name: table[col_name].cast("string") for col_name in json_columns}
        return table.mutate(**mutations)
    return table

Configuration

Experimental

This functionality is experimental.

Configuration for PostgreSQLMetadataStore.

Inherits connection_string, connection_params, table_prefix, auto_create_tables from IbisMetadataStoreConfig.

Show JSON schema:
{
  "$defs": {
    "HashAlgorithm": {
      "description": "Supported hash algorithms for field provenance calculation.\n\nThese algorithms are chosen for:\n\n- Speed (non-cryptographic hashes preferred)\n\n- Cross-database availability\n\n- Good collision resistance for field provenance calculation",
      "enum": [
        "xxhash64",
        "xxhash32",
        "wyhash",
        "sha256",
        "md5",
        "farmhash"
      ],
      "title": "HashAlgorithm",
      "type": "string"
    }
  },
  "additionalProperties": false,
  "description": "Configuration for PostgreSQLMetadataStore.\n\nInherits connection_string, connection_params, table_prefix, auto_create_tables from IbisMetadataStoreConfig.",
  "properties": {
    "fallback_stores": {
      "description": "List of fallback store names to search when features are not found in the current store.",
      "items": {
        "type": "string"
      },
      "title": "Fallback Stores",
      "type": "array"
    },
    "hash_algorithm": {
      "anyOf": [
        {
          "$ref": "#/$defs/HashAlgorithm"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Hash algorithm for versioning. If None, uses store's default."
    },
    "versioning_engine": {
      "default": "auto",
      "description": "Which versioning engine to use: 'auto' (prefer native), 'native', or 'polars'.",
      "enum": [
        "auto",
        "native",
        "polars"
      ],
      "title": "Versioning Engine",
      "type": "string"
    },
    "connection_string": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Ibis connection string (e.g., 'clickhouse://host:9000/db').",
      "title": "Connection String"
    },
    "backend": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Ibis backend name (e.g., 'clickhouse', 'postgres', 'duckdb').",
      "mkdocs_metaxy_hide": true,
      "title": "Backend"
    },
    "connection_params": {
      "anyOf": [
        {
          "additionalProperties": true,
          "type": "object"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Backend-specific connection parameters.",
      "title": "Connection Params"
    },
    "table_prefix": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Optional prefix for all table names.",
      "title": "Table Prefix"
    },
    "auto_create_tables": {
      "anyOf": [
        {
          "type": "boolean"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "If True, create tables on open. For development/testing only.",
      "title": "Auto Create Tables"
    },
    "auto_cast_struct_for_jsonb": {
      "default": true,
      "description": "Whether to encode/decode Struct columns to/from JSON on writes/reads. Metaxy system columns are always converted.",
      "title": "Auto Cast Struct For Jsonb",
      "type": "boolean"
    }
  },
  "title": "PostgreSQLMetadataStoreConfig",
  "type": "object"
}
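The constraints in the schema above (no additional properties, a fixed set of versioning engines and hash algorithms) can be checked by hand before a config reaches Metaxy. The sketch below is a hypothetical pre-flight check written against the schema as shown here. The real validation is done by the pydantic config model, and check_config is not a Metaxy function.

```python
# Property names and enum values copied from the JSON schema above.
ALLOWED_KEYS = {
    "fallback_stores", "hash_algorithm", "versioning_engine",
    "connection_string", "backend", "connection_params",
    "table_prefix", "auto_create_tables", "auto_cast_struct_for_jsonb",
}
VERSIONING_ENGINES = {"auto", "native", "polars"}
HASH_ALGORITHMS = {"xxhash64", "xxhash32", "wyhash", "sha256", "md5", "farmhash"}


def check_config(config: dict) -> list[str]:
    """Return a list of problems found in a store config dict (hypothetical helper)."""
    problems = []
    # "additionalProperties": false means unknown keys are rejected.
    for key in sorted(set(config) - ALLOWED_KEYS):
        problems.append(f"unknown key: {key}")
    engine = config.get("versioning_engine", "auto")
    if engine not in VERSIONING_ENGINES:
        problems.append(f"invalid versioning_engine: {engine}")
    algorithm = config.get("hash_algorithm")
    if algorithm is not None and algorithm not in HASH_ALGORITHMS:
        problems.append(f"invalid hash_algorithm: {algorithm}")
    return problems


ok = check_config({"versioning_engine": "polars", "table_prefix": "dev_"})
bad = check_config({"versioning_engine": "duckdb", "schema": "public"})
```

Here ok is an empty list, while bad reports both the unknown schema key and the unsupported engine value.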

fallback_stores pydantic-field

fallback_stores: list[str]

List of fallback store names to search when features are not found in the current store.

[stores.dev.config]
fallback_stores = []
[tool.metaxy.stores.dev.config]
fallback_stores = []
export METAXY_STORES__DEV__CONFIG__FALLBACK_STORES=[]

hash_algorithm pydantic-field

hash_algorithm: HashAlgorithm | None = None

Hash algorithm for versioning. If None, uses store's default.

[stores.dev.config]
hash_algorithm = "..."
[tool.metaxy.stores.dev.config]
hash_algorithm = "..."
export METAXY_STORES__DEV__CONFIG__HASH_ALGORITHM=...

versioning_engine pydantic-field

versioning_engine: Literal["auto", "native", "polars"] = (
    "auto"
)

Which versioning engine to use: 'auto' (prefer native), 'native', or 'polars'.

[stores.dev.config]
versioning_engine = "auto"
[tool.metaxy.stores.dev.config]
versioning_engine = "auto"
export METAXY_STORES__DEV__CONFIG__VERSIONING_ENGINE=auto

connection_string pydantic-field

connection_string: str | None = None

Ibis connection string (e.g., 'clickhouse://host:9000/db').

[stores.dev.config]
connection_string = "..."
[tool.metaxy.stores.dev.config]
connection_string = "..."
export METAXY_STORES__DEV__CONFIG__CONNECTION_STRING=...

connection_params pydantic-field

connection_params: dict[str, Any] | None = None

Backend-specific connection parameters.

[stores.dev.config]
connection_params = {}
[tool.metaxy.stores.dev.config]
connection_params = {}
export METAXY_STORES__DEV__CONFIG__CONNECTION_PARAMS=...

table_prefix pydantic-field

table_prefix: str | None = None

Optional prefix for all table names.

[stores.dev.config]
table_prefix = "..."
[tool.metaxy.stores.dev.config]
table_prefix = "..."
export METAXY_STORES__DEV__CONFIG__TABLE_PREFIX=...

auto_create_tables pydantic-field

auto_create_tables: bool | None = None

If True, create tables on open. For development/testing only.

[stores.dev.config]
auto_create_tables = false
[tool.metaxy.stores.dev.config]
auto_create_tables = false
export METAXY_STORES__DEV__CONFIG__AUTO_CREATE_TABLES=...

auto_cast_struct_for_jsonb pydantic-field

auto_cast_struct_for_jsonb: bool = True

Whether to encode/decode Struct columns to/from JSON on writes/reads. Metaxy system columns are always converted.

[stores.dev.config]
auto_cast_struct_for_jsonb = true
[tool.metaxy.stores.dev.config]
auto_cast_struct_for_jsonb = true
export METAXY_STORES__DEV__CONFIG__AUTO_CAST_STRUCT_FOR_JSONB=true
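The environment variable names in the examples above follow a visible pattern: the METAXY_STORES prefix, the store name, CONFIG, and the field name, upper-cased and joined with double underscores. A small sketch that reproduces this pattern is shown below. The helper env_var_name is hypothetical and only mirrors the examples on this page.

```python
def env_var_name(store: str, field: str) -> str:
    """Derive the environment variable name for a store config field."""
    # Pattern inferred from the examples on this page:
    # METAXY_STORES__<STORE>__CONFIG__<FIELD>
    return f"METAXY_STORES__{store.upper()}__CONFIG__{field.upper()}"


name = env_var_name("dev", "auto_cast_struct_for_jsonb")
print(f"export {name}=true")
```

For the dev store this reproduces the variable names listed for each field above.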