Skip to main content

AI/ML / Multi Agent Refarch / Threats / DEV

Model profiling and system-prompt extraction

CCC.MARefArc.TH13

Crafted prompt sequences probe model internals to extract proprietary system prompts, configurations, or fine-tuning and RAG corpus content, enabling intellectual-property theft, model cloning, or follow-on attacks.

Related Capabilities

IDTitleDescription
CCC.MARefArc.CP16Model-interaction zero-trust guardrailsEnforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.

Related Controls

IDTitleDescription
CCC.MARefArc.CN02User, Application, and Model FirewallingEstablish enforced trust boundaries between the user, the application, and the models and tools by routing all traffic through the agent, LLM, and MCP gateways where guardrails inspect and constrain requests and responses.
CCC.MARefArc.CN10AI Firewall Implementation and ManagementImplement and operate an AI firewall within the guardrail components that inspects prompts, content, and responses for injection, sensitive data, and policy violations.
CCC.MARefArc.CN12Tool Chain Validation and SanitizationValidate tool selection, sanitize tool-call parameters, and constrain tool sequencing within the runtime and MCP guardrails to prevent manipulation of agent tool use.

External Mappings

FrameworkIDRemarks
air-vecAIR-SEC-010-03