Feature Graph

FeatureGraph is a global "God" object that holds all the features loaded by Metaxy via the feature discovery mechanism.

Users may interact with FeatureGraph when writing custom migrations; otherwise they are not exposed to it.

metaxy.FeatureGraph

FeatureGraph()
Source code in src/metaxy/models/feature.py
def __init__(self):
    # Primary storage: FeatureDefinition objects
    self.feature_definitions_by_key: dict[FeatureKey, FeatureDefinition] = {}

Attributes

metaxy.FeatureGraph.project_version property

project_version: str

Generate a project version for the current project's features.

Uses feature_definition_version (spec + schema only), excluding external features. The project is determined from MetaxyConfig.project if set, otherwise from the graph's single project (via the project property).

Raises:

  • RuntimeError –

    If MetaxyConfig.project is not set and the graph is empty or spans multiple projects.

metaxy.FeatureGraph.has_external_features property

has_external_features: bool

Check if any feature in the graph is an external feature.

metaxy.FeatureGraph.project property

project: str

The single project for all non-external features in this graph.

Returns the project name if all non-external features belong to a single project.

Raises:

  • RuntimeError –

    If the graph is empty or features span multiple projects.
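
The single-project rule can be illustrated with a small, self-contained sketch. It does not use Metaxy: `Definition` and `single_project` are hypothetical stand-ins, the error messages are invented, and treating a graph that holds only external features as empty is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """Hypothetical stand-in for a feature definition (not Metaxy's class)."""

    project: str
    is_external: bool = False


def single_project(definitions: list[Definition]) -> str:
    """Return the one project shared by all non-external definitions."""
    projects = {d.project for d in definitions if not d.is_external}
    if not projects:
        # Assumption: a graph with only external features counts as empty.
        raise RuntimeError("graph has no non-external features")
    if len(projects) > 1:
        raise RuntimeError(f"features span multiple projects: {sorted(projects)}")
    return projects.pop()


defs = [Definition("video"), Definition("video"), Definition("audio", is_external=True)]
print(single_project(defs))  # video
```

The external `audio` definition is ignored, so the graph still resolves to a single project.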

Functions

metaxy.FeatureGraph.add_feature

add_feature(feature: type[BaseFeature]) -> None

Add a feature class to the graph.

Creates a FeatureDefinition from the class and delegates to add_feature_definition.

Parameters:

  • feature (type[BaseFeature]) –

    Feature class to register

Raises:

  • ValueError –

    If a feature with a different import path but the same key is already registered or if duplicate column names would result from renaming operations

Source code in src/metaxy/models/feature.py
def add_feature(self, feature: type["BaseFeature"]) -> None:
    """Add a feature class to the graph.

    Creates a FeatureDefinition from the class and delegates to add_feature_definition.

    Args:
        feature: Feature class to register

    Raises:
        ValueError: If a feature with a different import path but the same key is already registered
                   or if duplicate column names would result from renaming operations
    """
    definition = FeatureDefinition.from_feature_class(feature)
    self.add_feature_definition(definition)

metaxy.FeatureGraph.add_feature_definition

add_feature_definition(
    definition: FeatureDefinition,
    on_conflict: Literal["raise", "ignore"] = "raise",
) -> None

Add a feature to the graph.

Interactions with External Features

Normal features take priority over external features with the same key.

Parameters:

  • definition (FeatureDefinition) –

    FeatureDefinition to register

  • on_conflict (Literal['raise', 'ignore'], default: 'raise' ) –

    What to do if a feature with the same key is already registered

Raises:

  • ValueError –

    If a non-external feature with a different import path but the same key is already registered and on_conflict is "raise"
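
The precedence rules can be sketched on a plain dictionary. This is a simplified illustration rather than Metaxy's implementation: `Definition`, `register`, and the class paths are hypothetical, and the two error cases are collapsed into a single message.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Definition:
    """Hypothetical stand-in; only the fields the rules depend on."""

    key: str
    class_path: str
    is_external: bool = False


def register(
    registry: dict[str, Definition],
    new: Definition,
    on_conflict: Literal["raise", "ignore"] = "raise",
) -> None:
    """Apply the precedence rules to a plain dict registry."""
    old = registry.get(new.key)
    if old is None:
        registry[new.key] = new  # first registration for this key
    elif new.is_external and not old.is_external:
        return  # an external feature never overwrites a normal one
    elif not new.is_external and old.is_external:
        registry[new.key] = new  # a normal feature replaces an external one
    elif new.class_path == old.class_path:
        registry[new.key] = new  # same class path: quiet replacement
    elif on_conflict == "ignore":
        return  # keep the existing definition
    else:
        raise ValueError(f"Feature with key {new.key} already registered.")


registry: dict[str, Definition] = {}
register(registry, Definition("video/raw", "ext.Raw", is_external=True))
register(registry, Definition("video/raw", "app.Raw"))  # replaces the external one
register(registry, Definition("video/raw", "ext.Raw", is_external=True))  # no-op
print(registry["video/raw"].class_path)  # app.Raw
```

Note that `on_conflict` only matters once the external/normal and same-path checks have failed to settle the conflict.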

Source code in src/metaxy/models/feature.py
def add_feature_definition(
    self, definition: FeatureDefinition, on_conflict: Literal["raise", "ignore"] = "raise"
) -> None:
    """Add a feature to the graph.

    !!! note "Interactions with External Features"

        Normal features take priority over external features with the same key.

    Args:
        definition: FeatureDefinition to register
        on_conflict: What to do if a feature with the same key is already registered

    Raises:
        ValueError: If a non-external feature with a different import path but
            the same key is already registered and `on_conflict` is `"raise"`
    """
    key = definition.key

    if key not in self.feature_definitions_by_key:
        self.feature_definitions_by_key[key] = definition
    elif definition.is_external and not self.feature_definitions_by_key[key].is_external:
        # External features never overwrite non-external features
        return
    elif not definition.is_external and self.feature_definitions_by_key[key].is_external:
        # Non-external features always replace external features
        # Note: version mismatch checking is done in load_feature_definitions,
        # not here, because we need the full graph context to compute
        # provenance-carrying versions.
        self.feature_definitions_by_key[key] = definition
    elif definition.feature_class_path == self.feature_definitions_by_key[key].feature_class_path:
        # Same class path - allow quiet replacement
        self.feature_definitions_by_key[key] = definition
    elif on_conflict == "ignore":
        # Conflict exists but we're ignoring - keep existing definition
        return
    elif definition.is_external:
        # Both external with different class paths - raise to be safe
        raise ValueError(f"External feature with key {key.to_string()} is already registered.")
    else:
        # Both non-external with different class paths
        raise ValueError(
            f"Feature with key {key.to_string()} already registered. "
            f"Existing: {self.feature_definitions_by_key[key].feature_class_path}, "
            f"New: {definition.feature_class_path}. "
            f"Each feature key must be unique within a graph."
        )

metaxy.FeatureGraph.get_feature_definition

get_feature_definition(
    key: CoercibleToFeatureKey,
) -> FeatureDefinition

Get a FeatureDefinition by its key.

This is the primary method for accessing feature information.

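Parameters:

  • key (CoercibleToFeatureKey) –

    Feature key to look up

Returns:

  • FeatureDefinition –

    FeatureDefinition for the feature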

Raises:

  • KeyError –

    If no feature with the given key is registered

Source code in src/metaxy/models/feature.py
def get_feature_definition(self, key: CoercibleToFeatureKey) -> FeatureDefinition:
    """Get a FeatureDefinition by its key.

    This is the primary method for accessing feature information.

    Args:
        key: Feature key to look up

    Returns:
        FeatureDefinition for the feature

    Raises:
        KeyError: If no feature with the given key is registered
    """
    validated_key = ValidatedFeatureKeyAdapter.validate_python(key)

    if validated_key not in self.feature_definitions_by_key:
        raise KeyError(
            f"No feature with key {validated_key.to_string()} found in graph. "
            f"Available keys: {[k.to_string() for k in self.feature_definitions_by_key.keys()]}"
        )
    return self.feature_definitions_by_key[validated_key]

metaxy.FeatureGraph.remove_feature

remove_feature(key: CoercibleToFeatureKey) -> None

Remove a feature from the graph.

Parameters:

  • key (CoercibleToFeatureKey) –

    Feature key to remove. Accepts types that can be converted into a feature key.

Raises:

  • KeyError –

    If no feature with the given key is registered

Source code in src/metaxy/models/feature.py
def remove_feature(self, key: CoercibleToFeatureKey) -> None:
    """Remove a feature from the graph.

    Args:
        key: Feature key to remove. Accepts types that can be converted into a feature key.

    Raises:
        KeyError: If no feature with the given key is registered
    """
    # Validate and coerce the key
    validated_key = ValidatedFeatureKeyAdapter.validate_python(key)

    if validated_key not in self.feature_definitions_by_key:
        raise KeyError(
            f"No feature with key {validated_key.to_string()} found in graph. "
            f"Available keys: {[k.to_string() for k in self.feature_definitions_by_key]}"
        )

    del self.feature_definitions_by_key[validated_key]

metaxy.FeatureGraph.list_features

list_features(
    projects: list[str] | str | None = None,
    *,
    only_current_project: bool = True,
) -> list[FeatureKey]

List all feature keys in the graph, optionally filtered by project(s).

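By default, only features whose definition belongs to the current project (MetaxyConfig.project) are returned. This prevents operations from affecting features in other projects. If no project is configured, or the config is not initialized, all features are returned.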

Parameters:

  • projects (list[str] | str | None, default: None ) –

    Project name(s) to filter by. Can be:

      - None: Use current project from MetaxyConfig (if only_current_project=True)
      - str: Single project name
      - list[str]: Multiple project names

  • only_current_project (bool, default: True ) –

    If True, filter by current/specified project(s). If False, return all features regardless of project.

Returns:

  • list[FeatureKey] –

    List of feature keys

Example
# Get features for specific project
features = graph.list_features(projects="myproject")

# Get all features regardless of project
all_features = graph.list_features(only_current_project=False)
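
The filtering rules can also be sketched without Metaxy, on a plain mapping from feature key to project. `filter_by_project` and its `current_project` argument (a stand-in for `MetaxyConfig.project`) are hypothetical names used only for this illustration.

```python
from typing import Optional, Union


def filter_by_project(
    project_by_key: dict[str, str],
    projects: Union[list[str], str, None] = None,
    *,
    only_current_project: bool = True,
    current_project: Optional[str] = None,
) -> list[str]:
    """Mimic the filtering rules on a plain {feature_key: project} mapping."""
    if not only_current_project:
        return list(project_by_key)  # `projects` is ignored in this mode
    if projects is None:
        if current_project is None:
            return list(project_by_key)  # nothing to filter by
        wanted = [current_project]
    elif isinstance(projects, str):
        wanted = [projects]
    else:
        wanted = list(projects)
    return [key for key, project in project_by_key.items() if project in wanted]


features = {"video/raw": "video", "video/scene": "video", "audio/raw": "audio"}
print(filter_by_project(features, "audio"))  # ['audio/raw']
print(filter_by_project(features, current_project="video"))  # ['video/raw', 'video/scene']
```

One consequence worth noting: with `only_current_project=False`, every key is returned even if `projects` is given.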
Source code in src/metaxy/models/feature.py
def list_features(
    self,
    projects: list[str] | str | None = None,
    *,
    only_current_project: bool = True,
) -> list[FeatureKey]:
    """List all feature keys in the graph, optionally filtered by project(s).

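    By default, only features whose definition belongs to the current project
    (MetaxyConfig.project) are returned. This prevents operations from affecting
    features in other projects. If no project is configured, or the config is
    not initialized, all features are returned.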

    Args:
        projects: Project name(s) to filter by. Can be:
            - None: Use current project from MetaxyConfig (if only_current_project=True)
            - str: Single project name
            - list[str]: Multiple project names
        only_current_project: If True, filter by current/specified project(s).
            If False, return all features regardless of project.

    Returns:
        List of feature keys

    Example:
        ```py
        # Get features for specific project
        features = graph.list_features(projects="myproject")

        # Get all features regardless of project
        all_features = graph.list_features(only_current_project=False)
        ```
    """
    if not only_current_project:
        # Return all features (both class-based and definition-only)
        return list(self.feature_definitions_by_key.keys())

    # Normalize projects to list
    project_list: list[str]
    if projects is None:
        # Try to get from config context
        try:
            from metaxy.config import MetaxyConfig

            config = MetaxyConfig.get()
            if config.project is None:
                # No project configured - return all features
                return list(self.feature_definitions_by_key.keys())
            project_list = [config.project]
        except RuntimeError:
            # Config not initialized - in tests or non-CLI usage
            # Return all features (can't determine project)
            return list(self.feature_definitions_by_key.keys())
    elif isinstance(projects, str):
        project_list = [projects]
    else:
        project_list = projects

    # Filter by project(s) using FeatureDefinition.project
    return [key for key, defn in self.feature_definitions_by_key.items() if defn.project in project_list]

metaxy.FeatureGraph.get_feature_plan

get_feature_plan(key: CoercibleToFeatureKey) -> FeaturePlan

Get a feature plan for a given feature key.

Parameters:

  • key (CoercibleToFeatureKey) –

    Feature key to get plan for. Accepts types that can be converted into a feature key.

Returns:

  • FeaturePlan –

    FeaturePlan instance with feature spec and dependencies.

Raises:

  • MetaxyMissingFeatureDependency –

    If any dependency is not in the graph.
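
A minimal sketch of the dependency check, assuming a graph reduced to a mapping from feature key to its direct dependency keys. `MissingDependency` and `plan_for` are hypothetical stand-ins for `MetaxyMissingFeatureDependency` and the real method; the returned dict is a placeholder for a FeaturePlan.

```python
class MissingDependency(Exception):
    """Hypothetical stand-in for MetaxyMissingFeatureDependency."""


def plan_for(key: str, deps_by_key: dict[str, list[str]]) -> dict:
    """Collect a feature's direct dependencies, failing on any that are missing."""
    resolved = []
    for dep in deps_by_key[key]:
        if dep not in deps_by_key:
            raise MissingDependency(
                f"Feature '{key}' depends on '{dep}' which is not in the graph."
            )
        resolved.append(dep)
    # A feature without dependencies gets None rather than an empty list.
    return {"feature": key, "deps": resolved or None}


graph = {"video/raw": [], "video/scene": ["video/raw"], "video/faces": ["video/crop"]}
print(plan_for("video/scene", graph))  # {'feature': 'video/scene', 'deps': ['video/raw']}
print(plan_for("video/raw", graph))  # {'feature': 'video/raw', 'deps': None}
```

Calling `plan_for("video/faces", graph)` raises `MissingDependency`, because `video/crop` was never registered.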

Source code in src/metaxy/models/feature.py
def get_feature_plan(self, key: CoercibleToFeatureKey) -> FeaturePlan:
    """Get a feature plan for a given feature key.

    Args:
        key: Feature key to get plan for. Accepts types that can be converted into a feature key.

    Returns:
        FeaturePlan instance with feature spec and dependencies.

    Raises:
        MetaxyMissingFeatureDependency: If any dependency is not in the graph.
    """
    from metaxy.utils.exceptions import MetaxyMissingFeatureDependency

    validated_key = ValidatedFeatureKeyAdapter.validate_python(key)

    definition = self.feature_definitions_by_key[validated_key]
    spec = definition.spec

    # Check all dependencies are present and collect their specs
    dep_specs = []
    for dep in spec.deps or []:
        if dep.feature not in self.feature_definitions_by_key:
            raise MetaxyMissingFeatureDependency(
                f"Feature '{validated_key.to_string()}' depends on '{dep.feature.to_string()}' "
                f"which is not in the graph."
            )
        dep_specs.append(self.feature_definitions_by_key[dep.feature].spec)

    return FeaturePlan(
        feature=spec,
        deps=dep_specs or None,
        feature_deps=spec.deps,
    )

metaxy.FeatureGraph.get_feature_version_by_field

get_feature_version_by_field(
    key: CoercibleToFeatureKey,
) -> dict[str, str]

Computes the field provenance map for a feature.

Hash together field provenance entries with the feature code version.

Parameters:

  • key (CoercibleToFeatureKey) –

    Feature key to get field versions for. Accepts types that can be converted into a feature key.

Returns:

  • dict[str, str] –

    dict[str, str]: The provenance hash for each field in the feature plan. Keys are field names as strings.

Source code in src/metaxy/models/feature.py
def get_feature_version_by_field(self, key: CoercibleToFeatureKey) -> dict[str, str]:
    """Computes the field provenance map for a feature.

    Hash together field provenance entries with the feature code version.

    Args:
        key: Feature key to get field versions for. Accepts types that can be converted into a feature key.

    Returns:
        dict[str, str]: The provenance hash for each field in the feature plan.
            Keys are field names as strings.
    """
    # Validate and coerce the key
    validated_key = ValidatedFeatureKeyAdapter.validate_python(key)

    res = {}

    plan = self.get_feature_plan(validated_key)

    for k, v in plan.feature.fields_by_key.items():
        res[k.to_string()] = self.get_field_version(FQFieldKey(field=k, feature=validated_key))

    return res

metaxy.FeatureGraph.get_feature_version

get_feature_version(key: CoercibleToFeatureKey) -> str

Computes the feature version as a single string.

Parameters:

  • key (CoercibleToFeatureKey) –

    Feature key to get version for. Accepts types that can be converted into a feature key.

Returns:

  • str –

    Truncated SHA256 hash representing the feature version.
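
As the source listing shows, the feature version is built by feeding each field name and its version into one SHA256 hasher in sorted field order. The sketch below reproduces that idea with the standard library only; `combine_field_versions` is a hypothetical name, and the truncation width of 8 characters is an assumption, since the behavior of `truncate_hash` is not shown here.

```python
import hashlib


def combine_field_versions(version_by_field: dict[str, str], length: int = 8) -> str:
    """Hash field names and versions in sorted order, then truncate.

    `length` is an assumed truncation width; Metaxy's `truncate_hash` may differ.
    """
    hasher = hashlib.sha256()
    for field in sorted(version_by_field):
        hasher.update(field.encode())
        hasher.update(version_by_field[field].encode())
    return hasher.hexdigest()[:length]


a = combine_field_versions({"frames": "abc", "audio": "def"})
b = combine_field_versions({"audio": "def", "frames": "abc"})
print(a == b)  # True: sorting makes the result independent of insertion order
```

Sorting the field names before hashing is what makes the version deterministic across runs and dict orderings.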

Source code in src/metaxy/models/feature.py
def get_feature_version(self, key: CoercibleToFeatureKey) -> str:
    """Computes the feature version as a single string.

    Args:
        key: Feature key to get version for. Accepts types that can be converted into a feature key.

    Returns:
        Truncated SHA256 hash representing the feature version.
    """
    # Validate and coerce the key
    validated_key = ValidatedFeatureKeyAdapter.validate_python(key)

    hasher = hashlib.sha256()
    provenance_by_field = self.get_feature_version_by_field(validated_key)
    for field_key in sorted(provenance_by_field):
        hasher.update(field_key.encode())
        hasher.update(provenance_by_field[field_key].encode())

    return truncate_hash(hasher.hexdigest())

metaxy.FeatureGraph.get_downstream_features

get_downstream_features(
    sources: Sequence[CoercibleToFeatureKey],
) -> list[FeatureKey]

Get all features downstream of sources, topologically sorted.

Performs a depth-first traversal of the dependency graph to find all features that transitively depend on any of the source features.

Parameters:

  • sources (Sequence[CoercibleToFeatureKey]) –

    List of source feature keys. Each element can be string, sequence, FeatureKey, or BaseFeature class.

Returns:

  • list[FeatureKey] –

    List of downstream feature keys in topological order (dependencies first). Does not include the source features themselves.

Example
# Build a DAG: a -> b -> d, a -> c -> d
class FeatureA(mx.BaseFeature, spec=mx.FeatureSpec(key="a", id_columns=["id"])):
    id: str


class FeatureB(
    mx.BaseFeature, spec=mx.FeatureSpec(key="b", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureA)])
):
    id: str


class FeatureC(
    mx.BaseFeature, spec=mx.FeatureSpec(key="c", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureA)])
):
    id: str


class FeatureD(
    mx.BaseFeature,
    spec=mx.FeatureSpec(
        key="d", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureB), mx.FeatureDep(feature=FeatureC)]
    ),
):
    id: str


graph.get_downstream_features(["a"])
# [FeatureKey(['b']), FeatureKey(['c']), FeatureKey(['d'])]
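
The traversal can be sketched independently of Metaxy on a plain mapping from feature key to direct dependency keys. `downstream` is a hypothetical helper that follows the same reversed post-order idea; it is an illustration, not the library's code.

```python
def downstream(deps_by_key: dict[str, list[str]], sources: list[str]) -> list[str]:
    """Return everything that transitively depends on `sources`, dependencies first."""
    visited: set[str] = set()
    post_order: list[str] = []

    def visit(key: str) -> None:
        if key in visited:
            return
        visited.add(key)
        # Recurse into every feature that lists `key` as a direct dependency.
        for other, deps in deps_by_key.items():
            if key in deps:
                visit(other)
        post_order.append(key)  # appended only after all of its dependents

    for source in sources:
        visit(source)

    # Reversed post-order is a topological order; drop the sources themselves.
    source_set = set(sources)
    return [key for key in reversed(post_order) if key not in source_set]


# Same DAG shape as in the example: a -> b -> d, a -> c -> d
dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
result = downstream(dag, ["a"])
print(sorted(result))  # ['b', 'c', 'd']
print(downstream(dag, ["b"]))  # ['d']
```

Because `b` and `c` do not depend on each other, their relative order depends on traversal order; the only guarantee is that `d` comes after both.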
Source code in src/metaxy/models/feature.py
def get_downstream_features(self, sources: Sequence[CoercibleToFeatureKey]) -> list[FeatureKey]:
    """Get all features downstream of sources, topologically sorted.

    Performs a depth-first traversal of the dependency graph to find all
    features that transitively depend on any of the source features.

    Args:
        sources: List of source feature keys. Each element can be string, sequence, FeatureKey, or BaseFeature class.

    Returns:
        List of downstream feature keys in topological order (dependencies first).
        Does not include the source features themselves.

    Example:
        ```py
        # Build a DAG: a -> b -> d, a -> c -> d
        class FeatureA(mx.BaseFeature, spec=mx.FeatureSpec(key="a", id_columns=["id"])):
            id: str


        class FeatureB(
            mx.BaseFeature, spec=mx.FeatureSpec(key="b", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureA)])
        ):
            id: str


        class FeatureC(
            mx.BaseFeature, spec=mx.FeatureSpec(key="c", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureA)])
        ):
            id: str


        class FeatureD(
            mx.BaseFeature,
            spec=mx.FeatureSpec(
                key="d", id_columns=["id"], deps=[mx.FeatureDep(feature=FeatureB), mx.FeatureDep(feature=FeatureC)]
            ),
        ):
            id: str


        graph.get_downstream_features(["a"])
        # [FeatureKey(['b']), FeatureKey(['c']), FeatureKey(['d'])]
        ```
    """
    # Validate and coerce the source keys
    validated_sources = ValidatedFeatureKeySequenceAdapter.validate_python(sources)

    source_set = set(validated_sources)
    visited = set()
    post_order = []  # Reverse topological order

    def visit(key: FeatureKey):
        """DFS traversal."""
        if key in visited:
            return
        visited.add(key)

        # Find all features that depend on this one
        for feature_key, definition in self.feature_definitions_by_key.items():
            if definition.spec.deps:
                for dep in definition.spec.deps:
                    if dep.feature == key:
                        # This feature depends on 'key', so visit it
                        visit(feature_key)

        post_order.append(key)

    # Visit all sources
    for source in validated_sources:
        visit(source)

    # Remove sources from result, reverse to get topological order
    result = [k for k in reversed(post_order) if k not in source_set]
    return result

metaxy.FeatureGraph.topological_sort_features

topological_sort_features(
    feature_keys: Sequence[CoercibleToFeatureKey]
    | None = None,
    *,
    descending: bool = False,
) -> list[FeatureKey]

Sort feature keys in topological order.

Uses stable alphabetical ordering when multiple nodes are at the same level. This ensures deterministic output for diff comparisons and migrations.

Implemented using depth-first search with post-order traversal.

Parameters:

  • feature_keys (Sequence[CoercibleToFeatureKey] | None, default: None ) –

    List of feature keys to sort. Each element can be string, sequence, FeatureKey, or BaseFeature class. If None, sorts all features (both Feature classes and standalone specs) in the graph.

  • descending (bool, default: False ) –

    If False (default), dependencies appear before dependents. For a chain A -> B -> C, returns [A, B, C]. If True, dependents appear before dependencies. For a chain A -> B -> C, returns [C, B, A].

Returns:

  • list[FeatureKey] –

    List of feature keys sorted in topological order

Example
class VideoRaw(mx.BaseFeature, spec=mx.FeatureSpec(key="video/raw", id_columns=["id"])):
    id: str


class VideoScene(
    mx.BaseFeature,
    spec=mx.FeatureSpec(key="video/scene", id_columns=["id"], deps=[mx.FeatureDep(feature=VideoRaw)]),
):
    id: str


graph.topological_sort_features(["video/raw", "video/scene"])
# [FeatureKey(['video', 'raw']), FeatureKey(['video', 'scene'])]
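
A minimal sketch of post-order DFS with alphabetical tie-breaking, assuming a graph reduced to a mapping from key to direct dependency keys. `topo_sort` and the feature names are hypothetical and stand in for the real method and keys.

```python
def topo_sort(deps_by_key: dict[str, list[str]], descending: bool = False) -> list[str]:
    """Post-order DFS over a plain {key: [dependency keys]} mapping."""
    visited: set[str] = set()
    result: list[str] = []

    def visit(key: str) -> None:
        if key in visited or key not in deps_by_key:
            return
        visited.add(key)
        # Visit dependencies alphabetically so ties resolve deterministically.
        for dep in sorted(deps_by_key[key], key=str.lower):
            visit(dep)
        result.append(key)  # all dependencies are already in `result`

    for key in sorted(deps_by_key, key=str.lower):
        visit(key)
    return result[::-1] if descending else result


deps = {"scene": ["raw"], "raw": [], "faces": ["scene"], "audio": []}
print(topo_sort(deps))  # ['audio', 'raw', 'scene', 'faces']
print(topo_sort(deps, descending=True))  # ['faces', 'scene', 'raw', 'audio']
```

The output is the same regardless of the order in which the keys were inserted, which is what makes diffs between runs stable.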
Source code in src/metaxy/models/feature.py
def topological_sort_features(
    self,
    feature_keys: Sequence[CoercibleToFeatureKey] | None = None,
    *,
    descending: bool = False,
) -> list[FeatureKey]:
    """Sort feature keys in topological order.

    Uses stable alphabetical ordering when multiple nodes are at the same level.
    This ensures deterministic output for diff comparisons and migrations.

    Implemented using depth-first search with post-order traversal.

    Args:
        feature_keys: List of feature keys to sort. Each element can be string, sequence,
            FeatureKey, or BaseFeature class. If None, sorts all features
            (both Feature classes and standalone specs) in the graph.
        descending: If False (default), dependencies appear before dependents.
            For a chain A -> B -> C, returns [A, B, C].
            If True, dependents appear before dependencies.
            For a chain A -> B -> C, returns [C, B, A].

    Returns:
        List of feature keys sorted in topological order

    Example:
        ```py
        class VideoRaw(mx.BaseFeature, spec=mx.FeatureSpec(key="video/raw", id_columns=["id"])):
            id: str


        class VideoScene(
            mx.BaseFeature,
            spec=mx.FeatureSpec(key="video/scene", id_columns=["id"], deps=[mx.FeatureDep(feature=VideoRaw)]),
        ):
            id: str


        graph.topological_sort_features(["video/raw", "video/scene"])
        # [FeatureKey(['video', 'raw']), FeatureKey(['video', 'scene'])]
        ```
    """
    # Determine which features to sort
    if feature_keys is None:
        # Include all features
        keys_to_sort = set(self.feature_definitions_by_key.keys())
    else:
        # Validate and coerce the feature keys
        validated_keys = ValidatedFeatureKeySequenceAdapter.validate_python(feature_keys)
        keys_to_sort = set(validated_keys)

    visited = set()
    result = []  # Topological order (dependencies first)

    def visit(key: FeatureKey):
        """DFS visit with post-order traversal."""
        if key in visited or key not in keys_to_sort:
            return
        visited.add(key)

        # Get dependencies from feature definition
        definition = self.feature_definitions_by_key.get(key)
        if definition and definition.spec.deps:
            # Sort dependencies alphabetically for deterministic ordering
            sorted_deps = sorted(
                (dep.feature for dep in definition.spec.deps),
                key=lambda k: k.to_string().lower(),
            )
            for dep_key in sorted_deps:
                if dep_key in keys_to_sort:
                    visit(dep_key)

        # Add to result after visiting dependencies (post-order)
        result.append(key)

    # Visit all keys in sorted order for deterministic traversal
    for key in sorted(keys_to_sort, key=lambda k: k.to_string().lower()):
        visit(key)

    # Post-order DFS gives topological order (dependencies before dependents)
    if descending:
        return list(reversed(result))
    return result

metaxy.FeatureGraph.get_project_version

get_project_version(project: str) -> str

Generate a project version for features belonging to a specific project.

Uses feature_definition_version (spec + schema only), excluding external features. This makes the project version independent of external feature changes.

Parameters:

  • project (str) –

    The project name to compute version for.

Returns:

  • str –

    A hash representing the project's feature definitions.

Source code in src/metaxy/models/feature.py
def get_project_version(self, project: str) -> str:
    """Generate a project version for features belonging to a specific project.

    Uses feature_definition_version (spec + schema only), excluding external features.
    This makes the project version independent of external feature changes.

    Args:
        project: The project name to compute version for.

    Returns:
        A hash representing the project's feature definitions.
    """
    project_features = sorted(
        (
            (key, defn)
            for key, defn in self.feature_definitions_by_key.items()
            if defn.project == project and not defn.is_external
        ),
        key=lambda x: x[0],
    )
    return self._compute_project_version(project_features)

metaxy.FeatureGraph.to_snapshot

to_snapshot(
    *, project: str | None = None
) -> dict[str, SerializedFeature]

Serialize graph to snapshot format.

Returns a dict mapping feature_key (string) to feature data dict, including the import path of the Feature class for reconstruction.

External features are excluded from the snapshot as they should not be pushed to the metadata store.

Parameters:

  • project (str | None, default: None ) –

    Only include features from this project. If not provided, uses the graph's single project (via the project property).

Returns:

  • dict[str, SerializedFeature] –

    Dictionary mapping feature_key (string) to feature data dict.

Raises:

  • RuntimeError –

    If no project is provided and features span multiple projects.

Source code in src/metaxy/models/feature.py
def to_snapshot(self, *, project: str | None = None) -> dict[str, SerializedFeature]:
    """Serialize graph to snapshot format.

    Returns a dict mapping feature_key (string) to feature data dict,
    including the import path of the Feature class for reconstruction.

    External features are excluded from the snapshot as they should not be
    pushed to the metadata store.

    Args:
        project: Only include features from this project. If not provided,
            uses the graph's single project (via the `project` property).

    Returns:
        Dictionary mapping feature_key (string) to feature data dict.

    Raises:
        RuntimeError: If no project is provided and features span multiple projects.
    """
    if project is None:
        project = self.project

    snapshot: dict[str, SerializedFeature] = {}

    for feature_key, definition in self.feature_definitions_by_key.items():
        # Skip external features - they should not be pushed to the metadata store
        if definition.is_external:
            continue

        # Skip features from other projects
        if definition.project != project:
            continue

        feature_key_str = feature_key.to_string()
        feature_spec_dict = definition.spec.model_dump(mode="json")
        feature_schema_dict = definition.feature_schema
        feature_version = self.get_feature_version(feature_key)
        definition_version = definition.feature_definition_version
        project = definition.project
        class_path = definition.feature_class_path
        assert class_path is not None, "feature_class_path must be set for serialization"

        snapshot[feature_key_str] = {
            "feature_spec": feature_spec_dict,
            "feature_schema": feature_schema_dict,
            FEATURE_VERSION_COL: feature_version,
            FEATURE_TRACKING_VERSION_COL: definition_version,
            "feature_class_path": class_path,
            "project": project,
        }

    return snapshot

metaxy.FeatureGraph.from_snapshot classmethod

from_snapshot(
    snapshot_data: Mapping[str, Mapping[str, Any]],
) -> FeatureGraph

Reconstruct graph from snapshot by creating FeatureDefinition objects.

This method creates FeatureDefinition objects directly from the snapshot data without any dynamic imports. The resulting graph contains all feature metadata needed for operations like migrations and comparisons.

Parameters:

  • snapshot_data (Mapping[str, Mapping[str, Any]]) –

    Dict of feature_key -> dict containing all required fields:

      - feature_spec (dict): The feature specification
      - feature_schema (dict): The JSON schema for the feature
      - feature_class_path (str): The import path of the feature class
      - project (str): The project name

Returns:

  • FeatureGraph –

    New FeatureGraph with FeatureDefinition objects

Raises:

  • KeyError –

    If required fields are missing from snapshot data

Example
snapshot_data = {}  # Placeholder: in practice, loaded from the metadata store
# Reconstruct the historical graph from the snapshot
historical_graph = FeatureGraph.from_snapshot(snapshot_data)
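
The required-field check can be sketched without Metaxy. `check_snapshot` is a hypothetical helper, and the entry below uses placeholder spec and schema payloads and an invented class path rather than real stored data.

```python
from collections.abc import Mapping
from typing import Any

REQUIRED_FIELDS = ("feature_spec", "feature_schema", "feature_class_path", "project")


def check_snapshot(snapshot_data: Mapping[str, Mapping[str, Any]]) -> None:
    """Raise KeyError for the first entry that lacks a required field."""
    for feature_key, feature_data in snapshot_data.items():
        missing = [name for name in REQUIRED_FIELDS if name not in feature_data]
        if missing:
            raise KeyError(
                f"Feature '{feature_key}' snapshot is missing required fields: {missing}"
            )


# Hypothetical entry; the empty dicts are placeholders, not real Metaxy payloads.
snapshot_data = {
    "video/raw": {
        "feature_spec": {},
        "feature_schema": {},
        "feature_class_path": "my_project.features.VideoRaw",
        "project": "my_project",
    }
}
check_snapshot(snapshot_data)  # passes silently
```

Running such a check before reconstruction gives a clear error for snapshots written in an older or incomplete format.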
Source code in src/metaxy/models/feature.py
@classmethod
def from_snapshot(
    cls,
    snapshot_data: Mapping[str, Mapping[str, Any]],
) -> "FeatureGraph":
    """Reconstruct graph from snapshot by creating FeatureDefinition objects.

    This method creates FeatureDefinition objects directly from the snapshot data
    without any dynamic imports. The resulting graph contains all feature metadata
    needed for operations like migrations and comparisons.

    Args:
        snapshot_data: Dict of feature_key -> dict containing all required fields:
            - feature_spec (dict): The feature specification
            - feature_schema (dict): The JSON schema for the feature
            - feature_class_path (str): The import path of the feature class
            - project (str): The project name

    Returns:
        New FeatureGraph with FeatureDefinition objects

    Raises:
        KeyError: If required fields are missing from snapshot data

    Example:
        ```py
        snapshot_data = {}  # Placeholder: in practice, loaded from the metadata store
        # Reconstruct the historical graph from the snapshot
        historical_graph = FeatureGraph.from_snapshot(snapshot_data)
        ```
    """
    graph = cls()

    required_fields = ("feature_spec", "feature_schema", "feature_class_path", "project")

    for feature_key_str, feature_data in snapshot_data.items():
        # Validate all required fields are present
        missing_fields = [f for f in required_fields if f not in feature_data]
        if missing_fields:
            raise KeyError(
                f"Feature '{feature_key_str}' snapshot is missing required fields: {missing_fields}. "
                f"All snapshots must include: {required_fields}"
            )

        definition = FeatureDefinition.from_stored_data(
            feature_spec=feature_data["feature_spec"],
            feature_schema=feature_data["feature_schema"],
            feature_class_path=feature_data["feature_class_path"],
            project=feature_data["project"],
            source="snapshot",
        )
        graph.add_feature_definition(definition)

    return graph

metaxy.FeatureGraph.get_active classmethod

get_active() -> FeatureGraph

Get the currently active graph.

Returns the graph from the context variable if set, otherwise returns the default global graph.

Returns:

  • FeatureGraph –

    Active FeatureGraph instance

Example
graph = mx.FeatureGraph.get_active()
Source code in src/metaxy/models/feature.py
@classmethod
def get_active(cls) -> "FeatureGraph":
    """Get the currently active graph.

    Returns the graph from the context variable if set, otherwise returns
    the default global graph.

    Returns:
        Active FeatureGraph instance

    Example:
        ```py
        graph = mx.FeatureGraph.get_active()
        ```
    """
    return _active_graph.get() or graph

metaxy.FeatureGraph.set_active classmethod

set_active(reg: FeatureGraph) -> None

Set the active graph for the current context.

This sets the context variable that will be returned by get_active(). Typically used in application setup code or test fixtures.

Parameters:

  • reg (FeatureGraph) –

    FeatureGraph to activate

Example
my_graph = mx.FeatureGraph()
mx.FeatureGraph.set_active(my_graph)
mx.FeatureGraph.get_active()  # Returns my_graph
Source code in src/metaxy/models/feature.py
@classmethod
def set_active(cls, reg: "FeatureGraph") -> None:
    """Set the active graph for the current context.

    This sets the context variable that will be returned by get_active().
    Typically used in application setup code or test fixtures.

    Args:
        reg: FeatureGraph to activate

    Example:
        ```py
        my_graph = mx.FeatureGraph()
        mx.FeatureGraph.set_active(my_graph)
        mx.FeatureGraph.get_active()  # Returns my_graph
        ```
    """
    _active_graph.set(reg)

metaxy.FeatureGraph.use

use() -> Iterator[Self]

Context manager to temporarily use this graph as active.

This is the recommended way to use a custom graph, especially in tests. The previously active graph is automatically restored when the context exits.

Yields:

  • FeatureGraph ( Self ) –

    This graph instance

Example
with graph.use():

    class TestFeature(mx.BaseFeature, spec=mx.FeatureSpec(key="test", id_columns=["id"])):
        id: str
Source code in src/metaxy/models/feature.py
@contextmanager
def use(self) -> Iterator[Self]:
    """Context manager to temporarily use this graph as active.

    This is the recommended way to use a custom graph, especially in tests.
    The previously active graph is automatically restored when the context exits.

    Yields:
        FeatureGraph: This graph instance

    Example:
        ```py
        with graph.use():

            class TestFeature(mx.BaseFeature, spec=mx.FeatureSpec(key="test", id_columns=["id"])):
                id: str
        ```
    """
    token = _active_graph.set(self)
    try:
        yield self
    finally:
        _active_graph.reset(token)