The model registry is the central repository from which production models are deployed and to which all compliance metadata is attached. This page covers the six capabilities a compliance-grade registry must support, the metadata required for AISDP traceability, stage management workflows, and how the composite version identifier links every inference to its full provenance chain.
The model registry is the central repository for trained model artefacts, serving the same function for models that the code repository serves for source code: the authoritative, version-controlled store from which production models are deployed and to which all model-related metadata is attached. A compliance-grade model registry must support six capabilities.
Immutable versioning ensures each registered model version is assigned a unique, non-reusable identifier that cannot be changed or overwritten after registration. Metadata attachment requires each model version to carry structured metadata including the training dataset version, training code version, hyperparameters, validation metrics, fairness metrics, and approval status. Lineage tracking links each model version to the specific data version, code version, and pipeline execution that produced it, enabling end-to-end provenance tracing.
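A minimal sketch of an immutable model version record carrying the metadata described above. All names are illustrative, not the API of any particular registry; a frozen dataclass stands in for the registry's write-once guarantee.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: fields cannot be changed after registration
class ModelVersion:
    name: str
    version: int            # unique, non-reusable identifier
    data_version: str       # e.g. a DVC commit hash
    code_commit: str        # git SHA of the training code
    hyperparameters: dict
    metrics: dict           # validation and fairness metrics
    artefact_bytes: bytes   # the serialised model artefact
    approval_status: str = "pending"

    def artefact_hash(self) -> str:
        # Integrity hash of the stored artefact, used for lineage checks.
        return hashlib.sha256(self.artefact_bytes).hexdigest()
```

Because the instance is frozen, any attempt to overwrite a field after registration raises `dataclasses.FrozenInstanceError`, mirroring the non-overwritable versioning requirement.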
Stage management requires models to progress through defined stages, typically experimental, staging, production, and archived, with stage transitions requiring documented approval. Access control enforces that only the CI/CD pipeline can promote a model to production; manual promotion is prohibited for high-risk systems to prevent unvalidated models from reaching deployment. Long-term retrieval requires archived models to be retrievable for the full ten-year retention period under Article 18.
Each production model version carries AISDP-required metadata tags including the training data version as a DVC commit hash, the code commit SHA, the pipeline execution identifier, the model artefact integrity hash, the validation gate result, and the key metrics such as AUC-ROC and demographic parity ratio. These tags enable any stakeholder to trace from a deployed model back to the exact data, code, and evaluation that produced it.
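One way to derive a composite version identifier from these tags is to hash a canonical serialisation of them, so that any change to data version, code commit, or gate result yields a different identifier. This is a sketch of the idea, not a prescribed scheme; the tag names mirror the list above but are illustrative.

```python
import hashlib

def composite_version_id(tags: dict) -> str:
    """Derive a short, deterministic identifier from provenance tags.

    Sorting the keys makes the identifier independent of insertion order.
    """
    canonical = "|".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]
```

Stamping this identifier on every inference response lets an auditor resolve it back to the exact data, code, and evaluation that produced the model.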
MLflow Model Registry, Weights and Biases Model Registry, Amazon SageMaker Model Registry, and Vertex AI Model Registry all provide the core capabilities. The choice should be informed by the organisation's existing infrastructure, the level of integration with the CI/CD pipeline, and the registry's support for immutable versioning and access control. Organisations that self-host their registry should ensure that the underlying storage, typically object storage backed by a metadata database, meets the durability and availability requirements for a compliance-critical artefact store.
AISDP Module 3 must document the model registry's role in the system's architecture, the versioning scheme, the stage management workflow, and the approval criteria for each stage transition. Module 2 must document how the registry integrates with the CI/CD pipeline and how model provenance is maintained. The registry's contents, specifically the metadata for each production model version, are themselves evidence artefacts for the evidence pack.
Without a model registry, model management reverts to manual file management with a tracking spreadsheet. Artefacts are stored in a dedicated directory or cloud storage bucket with access controls, organised by model name and version. The tracking spreadsheet includes columns for model name, version, storage location, content hash, training data version, code commit, training date, evaluation metrics, stage, approval evidence (approver name and date), and deployment date.
Stage transitions from staging to production require a signed approval entry by the AI Governance Lead, and no model artefact may be deleted. Archived models are moved to a separate archive directory. The Technical SME follows the provenance chain manually: looking up the model version in the spreadsheet, finding the code commit, finding the data version. This is manageable for a single system with infrequent model updates but becomes burdensome for multiple systems or frequent retraining. MLflow is open-source and free, making it the recommended first step beyond manual management.
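The manual provenance lookup can be sketched against the spreadsheet columns described above (exported as CSV). The model name, column values, and rows here are invented for illustration.

```python
import csv
import io

# Two illustrative rows matching the tracking spreadsheet's columns.
SHEET = """model,version,location,content_hash,data_version,code_commit,stage
credit-scorer,3,s3://models/credit/v3,abc123,dvc-7788,9f1e2d,production
credit-scorer,2,s3://models/credit/v2,def456,dvc-6655,8e0d1c,archived
"""

def provenance(model: str, version: str) -> dict:
    """Replicate the manual chain: spreadsheet row -> code commit and data version."""
    for row in csv.DictReader(io.StringIO(SHEET)):
        if row["model"] == model and row["version"] == version:
            return {"code_commit": row["code_commit"],
                    "data_version": row["data_version"],
                    "content_hash": row["content_hash"]}
    raise KeyError(f"{model} v{version} not in tracking sheet")
```

Even automated, this lookup only works while the spreadsheet is kept rigorously in sync with the stored artefacts, which is precisely the burden a registry removes.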
Manual promotion bypasses the automated validation gates (performance, fairness, security) that the pipeline enforces. For high-risk systems, every model reaching production must have passed all gates, and the pipeline provides the auditable evidence trail.
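A sketch of the pipeline-side gate check, assuming gate results are collected into a simple mapping (the gate names come from the text; everything else is illustrative):

```python
GATES = ("performance", "fairness", "security")

def may_promote(results: dict) -> bool:
    """A model may reach production only if every gate has run and passed."""
    missing = [g for g in GATES if g not in results]
    if missing:
        raise ValueError(f"gates not yet run: {missing}")
    return all(results[g] == "passed" for g in GATES)
```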
A model card is structured documentation of the model's intended use, performance characteristics, limitations, and ethical considerations. It provides the human-readable summary that governance leads and regulators need alongside the technical metadata.
Manual management is technically workable for a single system with infrequent updates, but the burden of following provenance chains, verifying hashes, and managing stage transitions grows rapidly. MLflow is open-source and eliminates these burdens at no licensing cost.
MLflow provides four stages (None, Staging, Production, Archived) with governed transitions requiring AI Governance Lead approval. Promotion to Production is restricted to the CI/CD pipeline, and each approval event is logged as compliance evidence.