The model registry is the central repository from which production models are deployed and to which all compliance metadata is attached. This page covers the six capabilities a compliance-grade registry must support, the metadata required for AISDP traceability, stage management workflows, and how the composite version identifier links every inference to its full provenance chain.
The model registry is the central repository for trained model artefacts, serving the same function for models that the code repository serves for source code: the authoritative, version-controlled store from which production models are deployed and to which all model-related metadata is attached. A compliance-grade model registry must support six capabilities.
Immutable versioning ensures each registered model version is assigned a unique, non-reusable identifier that cannot be changed or overwritten after registration. Metadata attachment requires each model version to carry structured metadata including the training dataset version, training code version, hyperparameters, validation metrics, fairness metrics, and approval status. Lineage tracking links each model version to the specific data version, code version, and pipeline execution that produced it, enabling end-to-end provenance tracing.
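A minimal sketch of an immutable model version record carrying the metadata described above. All names are illustrative, not the API of any particular registry; a frozen dataclass stands in for the registry's write-once guarantee.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: fields cannot be changed after registration
class ModelVersion:
    name: str
    version: int            # unique, non-reusable identifier
    data_version: str       # e.g. a DVC commit hash
    code_commit: str        # git SHA of the training code
    hyperparameters: dict
    metrics: dict           # validation and fairness metrics
    artefact_bytes: bytes   # the serialised model artefact
    approval_status: str = "pending"

    def artefact_hash(self) -> str:
        # Integrity hash of the stored artefact, used for lineage checks.
        return hashlib.sha256(self.artefact_bytes).hexdigest()
```

Because the instance is frozen, any attempt to overwrite a field after registration raises `dataclasses.FrozenInstanceError`, mirroring the non-overwritable versioning requirement.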
Stage management requires models to progress through defined stages, typically experimental, staging, production, and archived, with stage transitions requiring documented approval. Access control enforces that only the CI/CD pipeline can promote a model to production; manual promotion is prohibited for high-risk systems to prevent unvalidated models from reaching deployment. Long-term retrieval requires archived models to be retrievable for the full ten-year retention period under Article 18.
Each production model version carries AISDP-required metadata tags including the training data version as a DVC commit hash, the code commit SHA, the pipeline execution identifier, the model artefact integrity hash, the validation gate result, and the key metrics such as AUC-ROC and demographic parity ratio. These tags enable any stakeholder to trace from a deployed model back to the exact data, code, and evaluation that produced it.
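One way to derive a composite version identifier from these tags is to hash a canonical serialisation of them, so that any change to data version, code commit, or gate result yields a different identifier. This is a sketch of the idea, not a prescribed scheme; the tag names mirror the list above but are illustrative.

```python
import hashlib

def composite_version_id(tags: dict) -> str:
    """Derive a short, deterministic identifier from provenance tags.

    Sorting the keys makes the identifier independent of insertion order.
    """
    canonical = "|".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]
```

Stamping this identifier on every inference response lets an auditor resolve it back to the exact data, code, and evaluation that produced the model.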
MLflow Model Registry, Weights and Biases Model Registry, Amazon SageMaker Model Registry, and Vertex AI Model Registry all provide the core capabilities. The choice should be informed by the organisation's existing infrastructure, the level of integration with the CI/CD pipeline, and the registry's support for immutable versioning and access control. Organisations that self-host their registry should ensure that the underlying storage, typically object storage backed by a metadata database, meets the durability and availability requirements for a compliance-critical artefact store.
AISDP Module 3 must document the model registry's role in the system's architecture, the versioning scheme, the stage management workflow, and the approval criteria for each stage transition. Module 2 must document how the registry integrates with the CI/CD pipeline and how model provenance is maintained. The registry's contents, specifically the metadata for each production model version, are themselves evidence artefacts for the evidence pack.
Without a model registry, model management reverts to manual file management with a tracking spreadsheet. Artefacts are stored in a dedicated directory or cloud storage bucket with access controls, organised by model name and version. The tracking spreadsheet includes columns for model name, version, storage location, content hash, training data version, code commit, training date, evaluation metrics, stage, approval evidence (approver name and date), and deployment date.
Stage transitions from staging to production require a signed approval entry by the AI Governance Lead, and no model artefact may be deleted. Archived models are moved to a separate archive directory. The Technical SME follows the provenance chain manually: looking up the model version in the spreadsheet, finding the code commit, finding the data version. This is manageable for a single system with infrequent model updates but becomes burdensome for multiple systems or frequent retraining. MLflow is open-source and free, making it the recommended first step beyond manual management.
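The manual provenance lookup can be sketched against the spreadsheet columns described above (exported as CSV). The model name, column values, and rows here are invented for illustration.

```python
import csv
import io

# Two illustrative rows matching the tracking spreadsheet's columns.
SHEET = """model,version,location,content_hash,data_version,code_commit,stage
credit-scorer,3,s3://models/credit/v3,abc123,dvc-7788,9f1e2d,production
credit-scorer,2,s3://models/credit/v2,def456,dvc-6655,8e0d1c,archived
"""

def provenance(model: str, version: str) -> dict:
    """Replicate the manual chain: spreadsheet row -> code commit and data version."""
    for row in csv.DictReader(io.StringIO(SHEET)):
        if row["model"] == model and row["version"] == version:
            return {"code_commit": row["code_commit"],
                    "data_version": row["data_version"],
                    "content_hash": row["content_hash"]}
    raise KeyError(f"{model} v{version} not in tracking sheet")
```

Even automated, this lookup only works while the spreadsheet is kept rigorously in sync with the stored artefacts, which is precisely the burden a registry removes.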
Manual promotion bypasses the automated validation gates (performance, fairness, security) that the pipeline enforces. For high-risk systems, every model reaching production must have passed all gates, and the pipeline provides the auditable evidence trail.
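A sketch of the pipeline-side gate check, assuming gate results are collected into a simple mapping (the gate names come from the text; everything else is illustrative):

```python
GATES = ("performance", "fairness", "security")

def may_promote(results: dict) -> bool:
    """A model may reach production only if every gate has run and passed."""
    missing = [g for g in GATES if g not in results]
    if missing:
        raise ValueError(f"gates not yet run: {missing}")
    return all(results[g] == "passed" for g in GATES)
```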
A model card is structured documentation of the model's intended use, performance characteristics, limitations, and ethical considerations. It provides the human-readable summary that governance leads and regulators need alongside the technical metadata.
Manual management is technically workable for a single system with infrequent updates, but the burden of following provenance chains, verifying hashes, and managing stage transitions grows rapidly. MLflow is open-source and eliminates these burdens at no licensing cost.
MLflow provides four stages (None, Staging, Production, Archived) with governed transitions requiring AI Governance Lead approval. Promotion to Production is restricted to the CI/CD pipeline, and each approval event is logged as compliance evidence.