AI systems must satisfy foundational cybersecurity requirements before addressing AI-specific threats. Article 15 of the EU AI Act requires high-risk AI systems to achieve an appropriate level of cybersecurity, building on established controls for network security, access management, encryption, and vulnerability management.
Before addressing AI-specific threats, the system must satisfy foundational cybersecurity requirements that apply to any software system handling sensitive data. The Technical Owner deploys the system's infrastructure within a dedicated Virtual Private Cloud, with network segmentation isolating the AI system from other organisational systems. The security team restricts ingress and egress traffic to documented, necessary flows, and web application firewalls should protect any internet-facing endpoints. DDoS protection should be in place for public-facing services.
Multi-factor authentication should be mandatory for all operator accounts and all administrative access. Role-based access control enforces the principle of least privilege. Service-to-service authentication should use mutual TLS or equivalent. The security team restricts access to model artefacts, training data, and configuration to authorised personnel, with all access auditable.
The engineering team encrypts data at rest using AES-256 or equivalent and encrypts data in transit using TLS 1.3. The security team manages encryption keys through a dedicated key management service with rotation policies, access logging, and separation of duties. These foundational measures form the baseline upon which AI-specific cybersecurity controls are layered, as covered in Cybersecurity for AI Systems.
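Key rotation is one of the policies the key management service enforces. As a minimal sketch of how a rotation check might look, assuming an illustrative 90-day rotation interval (the source does not specify one):

```python
from datetime import datetime, timedelta, timezone

# Illustrative policy value; the actual interval belongs in the
# documented key management policy, not in code.
ROTATION_INTERVAL = timedelta(days=90)

def rotation_due(created_at, now=None):
    """Return True if the key's age exceeds the rotation interval."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= ROTATION_INTERVAL
```

In practice the key management service performs this check itself; the sketch only makes the policy logic explicit.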
Continuous vulnerability scanning of all system components is required, covering application code, dependencies, container images, and infrastructure. Critical and high-severity vulnerabilities should have documented remediation timelines: typically 72 hours for critical vulnerabilities and 30 days for high-severity ones. The security team retains vulnerability scan results and remediation records as evidence for Module 9.
Operating system, framework, and dependency patches must be applied by the engineering team on a documented schedule. Emergency patches for zero-day vulnerabilities should follow an expedited process. The engineering team tests patch application in the staging environment before production deployment. Together, vulnerability management and patch management create a continuous cycle of detection, prioritisation, remediation, and verification that supports the overall cybersecurity risk management posture.
Traditional perimeter-based security models are insufficient for AI systems, which often span multiple trust boundaries: cloud infrastructure, on-premises data stores, third-party model APIs, deployer integration points, and human oversight interfaces. A zero trust architecture assumes no implicit trust based on network location and verifies every access request regardless of its origin.
For AI systems, zero trust principles translate into specific architectural decisions. Every service component, from data ingestion to model inference to post-processing, should authenticate and authorise independently. The model serving layer should not trust the data pipeline simply because both run within the same VPC. Each component should validate inputs against expected schemas, authenticate the calling service, and verify authorisation before processing.
Identity-based access replaces network-based trust. Service identities, managed through SPIFFE/SPIRE, cloud-native workload identity, or equivalent mechanisms, authenticate each microservice. Human identities flow through the entire request chain so that audit logs capture which operator's action triggered which model inference. Session tokens should carry the minimum claims necessary, and token lifetimes should be short enough to limit the window of compromise.
Microsegmentation at the workload level restricts lateral movement. Even if an attacker compromises the feature engineering service, they should not be able to reach the model artefact store, the training data repository, or the logging infrastructure. The security team defines network policies as allowlists, and any traffic not explicitly permitted is denied by default.
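The allowlist-with-default-deny model above can be sketched in a few lines. The service names and flows here are illustrative assumptions, not taken from the source; real deployments would express this as network policies (for example Kubernetes NetworkPolicy objects) rather than application code:

```python
# Default-deny microsegmentation: only explicitly allowlisted flows pass.
# (source service, destination service) pairs are illustrative.
ALLOWED_FLOWS = {
    ("feature-engineering", "feature-store"),
    ("model-serving", "feature-store"),
    ("model-serving", "model-artefact-store"),
}

def is_allowed(source, destination):
    """Permit a flow only if it appears on the allowlist; deny everything else."""
    return (source, destination) in ALLOWED_FLOWS
```

Note that a compromised feature engineering service cannot reach the model artefact store: that flow is simply absent from the allowlist.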
Every inference endpoint should require authentication, even for internal consumers. API keys or OAuth tokens should identify each consumer, enabling per-consumer rate limiting and usage tracking. The Technical SME calibrates rate limits to prevent model extraction attacks and denial-of-service attacks, with the thresholds documented in the AI system description package. Rate limiting operates at two levels: per-consumer limits that restrict individual usage patterns, and global limits that protect overall system capacity.
Per-consumer rate limiting, typically configured at the reverse proxy or API gateway layer, controls the number of inference requests a single consumer can make within a defined window. Global rate limiting caps total throughput across all consumers to prevent resource exhaustion. Burst allowances accommodate short spikes in legitimate traffic without compromising protection against sustained abuse. Request size limits prevent oversized adversarial inputs, and inference timeouts prevent denial-of-service through intentionally slow inputs. These controls work in concert with the broader Risk Management framework.
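The two-level scheme with burst allowances is commonly implemented as token buckets. A minimal sketch, with all rates and burst sizes as illustrative assumptions (the real thresholds live in the gateway configuration):

```python
import time

class TokenBucket:
    """Refills `rate` tokens per second up to a `burst` capacity."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

class InferenceGate:
    """Per-consumer buckets plus one global bucket protecting total capacity."""
    def __init__(self, per_rate=10, per_burst=20, glob_rate=100, glob_burst=200):
        self.global_bucket = TokenBucket(glob_rate, glob_burst)
        self.per_rate, self.per_burst = per_rate, per_burst
        self.per = {}

    def allow(self, consumer):
        """A request passes only if both its consumer's bucket and the global bucket permit it."""
        bucket = self.per.setdefault(consumer, TokenBucket(self.per_rate, self.per_burst))
        return bucket.allow() and self.global_bucket.allow()
```

Production deployments would place this logic in the API gateway or reverse proxy rather than in application code, but the accounting is the same.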
Model endpoints should validate inputs against a strict schema before they reach the model. Input dimensions, data types, value ranges, and content length are enforced by the serving infrastructure. For text inputs, injection pattern detection should filter known adversarial patterns. For image inputs, format validation, dimension checks, and anomaly detection on pixel distributions can identify adversarial perturbations.
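A hedged sketch of the text-input path: the field name, length limit, and injection patterns below are illustrative assumptions, not values from the source, and real pattern detection would be far more sophisticated than substring matching.

```python
MAX_TEXT_LENGTH = 4096  # illustrative limit
# Illustrative known-adversarial patterns; a real filter would use a
# maintained ruleset or a dedicated classifier.
INJECTION_PATTERNS = ("ignore previous instructions", "system prompt")

def validate_text_input(payload):
    """Return a list of validation errors; an empty list means the input passes."""
    errors = []
    text = payload.get("text")
    if not isinstance(text, str):
        errors.append("text must be a string")
        return errors
    if len(text) > MAX_TEXT_LENGTH:
        errors.append("text exceeds maximum length")
    lowered = text.lower()
    for pattern in INJECTION_PATTERNS:
        if pattern in lowered:
            errors.append(f"matched injection pattern: {pattern!r}")
    return errors
```

The key property is that validation runs in the serving infrastructure, before the request ever reaches the model.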
Model outputs should pass through a filtering layer before reaching the consumer. For classification models, confidence scores below a minimum threshold should trigger a "low confidence" flag rather than a definitive classification. For generative models, output filters should detect and redact personally identifiable information, detect content that falls outside the system's intended purpose, and enforce output length limits.
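As a minimal sketch of both filter paths, assuming an illustrative confidence threshold and length limit, and using an email regex as a stand-in for full PII detection (real systems would use a dedicated PII detection service):

```python
import re

CONFIDENCE_THRESHOLD = 0.7  # illustrative
MAX_OUTPUT_CHARS = 2000     # illustrative
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def filter_classification(label, confidence):
    """Flag low-confidence predictions instead of returning them as definitive."""
    return label if confidence >= CONFIDENCE_THRESHOLD else "low_confidence"

def filter_generation(text):
    """Redact email-shaped PII and enforce the output length limit."""
    return EMAIL_RE.sub("[REDACTED]", text)[:MAX_OUTPUT_CHARS]
```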
The Technical SME tracks and documents model API versions. When a model is updated, the API version should change to prevent consumers from unknowingly receiving outputs from a different model version. The Technical Owner retires deprecated API versions on a documented schedule and notifies consumers in advance. The system logs every inference request and response with sufficient detail for forensic analysis, including the consumer identity, the input or a hash of the input for privacy-sensitive systems, the output, the model version, the inference latency, and any validation or filtering actions taken. These logs feed directly into the post-market monitoring system and the incident response process.
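The log record described above might be structured as follows; the field names are illustrative, and the input is stored as a SHA-256 hash for the privacy-sensitive case the text mentions:

```python
import hashlib
import json
import time

def inference_log_record(consumer_id, input_text, output, model_version,
                         latency_ms, filter_actions):
    """Build a structured inference log entry as a JSON string.

    The raw input is replaced by its SHA-256 hash so the record supports
    forensic correlation without retaining personal data verbatim.
    """
    return json.dumps({
        "timestamp": time.time(),
        "consumer_id": consumer_id,
        "input_sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "output": output,
        "model_version": model_version,
        "latency_ms": latency_ms,
        "filter_actions": filter_actions,
    })
```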
Training datasets often contain the most sensitive data in the ML pipeline. The security team restricts access to authorised data engineers and model developers, with access logged and reviewed. The engineering team encrypts training data at rest and in transit. Where training data includes personal data, the encryption key management must align with GDPR retention and deletion requirements. Immutable audit logs should record every access to training data, enabling the organisation to demonstrate that data handling complied with the documented governance framework.
Feature stores aggregate and serve pre-computed features for model training and inference. They can become single points of compromise: an attacker who can modify feature values can influence model outputs without touching the model itself. Feature stores should enforce write access controls so that only authorised pipeline components can write features, integrity checks using checksums or cryptographic signatures on feature values, versioning so that every feature value change is recorded with a timestamp and provenance, and read access controls so that only authorised model serving components can read features. These data security requirements align with the broader data governance controls described in Data Governance.
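The integrity-check requirement can be sketched with an HMAC over a canonical serialisation of the feature values. The key handling here is deliberately simplified; in production the signing key would live in the key management service, not in code:

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"example-key"  # illustrative; production keys live in the KMS

def sign_features(features):
    """HMAC-SHA256 over a canonical (sorted-key) serialisation of the features."""
    payload = json.dumps(features, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def verify_features(features, signature):
    """Reject feature rows whose values were modified after signing."""
    return hmac.compare_digest(sign_features(features), signature)
```

A serving component that verifies the signature before reading features would detect the attack described above: a modified feature value no longer matches the recorded signature.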
Trained model files are valuable intellectual property and a potential attack vector. The engineering team stores model artefacts in encrypted, access-controlled repositories with immutable versioning. Cryptographic signing of model artefacts enables the inference infrastructure to verify that the model loaded for production serving matches the model that passed the validation gates. The pipeline rejects any model artefact that fails signature verification, and the event should trigger a security alert.
Inference logs contain the system's production inputs and outputs, which may include personal data, commercially sensitive information, or data subject to legal privilege. The security team restricts log access to authorised monitoring and audit personnel. The engineering team encrypts logs at rest and retains them according to the documented retention policy. Where inference logs are used for model retraining, a common practice for continuous improvement, the data governance controls apply to the retraining dataset derived from those logs.
Systems that use retrieval-augmented generation, semantic search, or embedding-based matching store dense vector embeddings in specialised databases. These vector stores require the same security treatment as any database holding potentially sensitive data, along with controls for attack surfaces specific to vector retrieval.
Access control must enforce separation between write access, used during knowledge base indexing, and read access, used during inference-time retrieval. The indexing pipeline should authenticate as a dedicated service identity with write permissions. The inference service should authenticate as a separate identity with read-only permissions. Administrative operations such as index deletion, schema changes, and bulk exports should require elevated privileges and produce audit log entries.
Encryption at rest protects the stored embeddings. Embeddings derived from documents containing personal data may themselves constitute personal data under GDPR, which means the encryption, retention, and deletion requirements that apply to the source documents extend to the embeddings. The vector database's encryption configuration should be documented in the architecture security section of Module 9.
The vector database introduces a novel attack surface: adversarial document injection. An attacker who can insert documents into the knowledge base can craft documents designed to be retrieved for specific target queries, allowing indirect manipulation of the LLM's output. For example, in a customer-facing RAG system, an attacker could inject a document containing misleading product safety information that is semantically similar to common customer queries about that product. The document would be retrieved and presented to the LLM as authoritative context, potentially causing the system to generate harmful or incorrect responses.
The engineering team should encrypt data at rest using AES-256 or equivalent and data in transit using TLS 1.3. Encryption keys should be managed through a dedicated key management service with rotation policies, access logging, and separation of duties.
Traditional perimeter security assumes trust within the network boundary. Zero trust assumes no implicit trust based on network location, requiring every service component to authenticate and authorise independently, with identity-based access replacing network-based trust.
An attacker who can insert documents into a knowledge base can craft documents designed to be retrieved for specific target queries, allowing indirect manipulation of LLM output. Controls include strict indexing access control, content provenance verification, and anomaly detection on retrieval patterns.
Every inference endpoint requires authentication, per-consumer rate limiting, input validation against strict schemas, output filtering, API versioning, and comprehensive logging of all requests and responses.
Training data, feature stores, model artefacts, and inference logs each require proportionate access controls, encryption, audit logging, and integrity verification, with GDPR alignment where personal data is involved.
Vector databases require separation of write and read access, encryption of embeddings that may constitute personal data, content provenance verification, and anomaly detection against adversarial document injection and bulk extraction.
Continuous verification replaces one-time authentication. Access decisions should be re-evaluated based on context: the requesting service's current security posture, the sensitivity of the requested resource, the time of day, and the anomaly score of the request pattern. A model serving endpoint that normally processes fifty requests per second should trigger additional verification if it suddenly receives five hundred. The zero trust architecture, identity and access management framework, microsegmentation policy, and continuous verification mechanisms feed into both Module 9 (cybersecurity) and Module 5 (architecture).
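The fifty-versus-five-hundred example can be sketched as a sliding-window rate check that triggers step-up verification. The baseline and multiplier below are illustrative assumptions:

```python
from collections import deque

class RateAnomalyDetector:
    """Flags request rates far above a baseline, triggering additional verification."""
    def __init__(self, baseline_rps, multiplier=5.0, window_s=1.0):
        self.threshold = baseline_rps * multiplier  # requests/sec that trip the alarm
        self.window_s = window_s
        self.events = deque()

    def record(self, timestamp):
        """Record one request; return True if step-up verification is required."""
        self.events.append(timestamp)
        # Evict events that have fallen out of the sliding window.
        while self.events and self.events[0] <= timestamp - self.window_s:
            self.events.popleft()
        return len(self.events) / self.window_s > self.threshold
```

In a real zero trust deployment this signal would be one input to a policy engine alongside service posture and resource sensitivity, rather than a standalone gate.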
Controls against this attack include strict access control on the indexing pipeline, content validation and provenance verification for all documents entering the knowledge base, anomaly detection that flags newly indexed documents whose embeddings are unusually close to high-frequency queries, and monitoring of retrieval patterns for sudden changes in which documents are retrieved for stable queries. Bulk extraction monitoring is also necessary: an attacker with query access could systematically probe the embedding space to reconstruct or infer the contents of the knowledge base. Rate limiting on vector search queries, anomaly detection on query patterns, particularly sequential queries that systematically scan the embedding space, and audit logging of all queries support detection and forensic investigation.
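The check on newly indexed documents can be sketched as a cosine-similarity comparison against embeddings of high-frequency queries. The similarity threshold is an illustrative assumption, and a production system would run this inside the indexing pipeline against the live query statistics:

```python
import math

SIMILARITY_THRESHOLD = 0.95  # illustrative; tune against the system's own query distribution

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def flag_suspicious(doc_embedding, hot_query_embeddings):
    """Flag a newly indexed document whose embedding sits unusually close
    to any high-frequency query embedding, a signature of injection
    attacks that target specific queries."""
    return any(cosine(doc_embedding, q) >= SIMILARITY_THRESHOLD
               for q in hot_query_embeddings)
```

Flagged documents would be held for review rather than rejected outright, since legitimate documents can also be highly relevant to popular queries.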