Skip to main content

AI/ML / Multi Agent Refarch / Threats / DEV

Model profiling and system-prompt extraction

CCC.MARefArc.TH13

Crafted prompt sequences probe model internals to extract proprietary system prompts, configurations, or fine-tuning and RAG corpus content, enabling intellectual-property theft, model cloning, or follow-on attacks.

Related Capabilities

ID	Title	Description
CCC.MARefArc.CP16	Model-interaction zero-trust guardrails	Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.

Related Controls

ID	Title	Description
CCC.MARefArc.CN02	User, Application, and Model Firewalling	Establish enforced trust boundaries between the user, the application, and the models and tools by routing all traffic through the agent, LLM, and MCP gateways where guardrails inspect and constrain requests and responses.
CCC.MARefArc.CN10	AI Firewall Implementation and Management	Implement and operate an AI firewall within the guardrail components that inspects prompts, content, and responses for injection, sensitive data, and policy violations.
CCC.MARefArc.CN12	Tool Chain Validation and Sanitization	Validate tool selection, sanitize tool-call parameters, and constrain tool sequencing within the runtime and MCP guardrails to prevent manipulation of agent tool use.

External Mappings

Framework	ID	Remarks
air-vec	AIR-SEC-010-03