Skip to main content

AI/ML / Multi Agent Refarch / Threats / DEV

Backdoor triggers and safety-mechanism disablement

CCC.MARefArc.TH21

Where weights are accessible, adversarial fine-tuning, engineered trigger phrases, or tampering disables alignment and content-moderation safeguards, causing targeted unsafe behaviour under specific conditions.

Related Capabilities

ID	Title	Description
CCC.MARefArc.CP16	Model-interaction zero-trust guardrails	Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.
CCC.MARefArc.CP14	Approved-model registry and lifecycle	Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards.

Related Controls

ID	Title	Description
CCC.MARefArc.CN05	Legal and Contractual Frameworks for AI Systems	Establish contractual controls with model and MCP service providers covering data handling, retention and deletion, intellectual property, liability, and supply-chain integrity.
CCC.MARefArc.CN08	Role-Based Access Control for AI Data	Enforce least-privilege, role-based access control over all AI data stores, including source bases, the vector store, and model artifacts.
CCC.MARefArc.CN13	MCP Server Security Governance	Govern the onboarding, verification, and ongoing monitoring of MCP servers so that only approved, integrity-verified servers are reachable, and supply-chain compromise is detected.

External Mappings

Framework	ID	Remarks
air-vec	AIR-SEC-008-04
air-vec	AIR-SEC-008-05