The system generates offensive, misleading, or inappropriate outputs, or is manipulated into doing so, that are attributed to the organization, with reputational and regulatory impact when output filtering and human review are insufficient.
AI/ML / Multi Agent Refarch / Threats / DEV
Reputational harm from offensive or misleading outputs
CCC.MARefArc.TH15
Related Capabilities
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.CP16 | Model-interaction zero-trust guardrails | Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution. |
| CCC.MARefArc.CP22 | Runtime protection | Monitors agent actions and model outputs during execution to detect unsafe, non-compliant, or anomalous behavior, enforcing constraints, blocking disallowed actions, or triggering escalation. |
| CCC.MARefArc.CP02 | Human-in-the-loop output review | Application-embedded controls that allow users to review, approve, or modify agent outputs before they are executed or shared. |
Related Controls
| ID | Title | Description |
|---|---|---|
| CCC.MARefArc.CN02 | User, Application, and Model Firewalling | Establish enforced trust boundaries between the user, the application, and the models and tools by routing all traffic through the agent, LLM, and MCP gateways where guardrails inspect and constrain requests and responses. |
| CCC.MARefArc.CN05 | Legal and Contractual Frameworks for AI Systems | Establish contractual controls with model and MCP service providers covering data handling, retention and deletion, intellectual property, liability, and supply-chain integrity. |
| CCC.MARefArc.CN10 | AI Firewall Implementation and Management | Implement and operate an AI firewall within the guardrail components that inspects prompts, content, and responses for injection, sensitive data, and policy violations. |
| CCC.MARefArc.CN19 | Human Feedback Loop for AI Systems | Capture human feedback on agent outputs through the Feedback Engine and Human Supervision capabilities and feed it into evaluation and improvement of agents and models. |
| CCC.MARefArc.CN20 | Citations and Source Traceability for AI-Generated Information | Attach citations and source traceability to AI-generated information so that outputs can be verified against retrieved sources and decisions can be explained. |
External Mappings
| Framework | ID | Remarks |
|---|---|---|
| air-vec | AIR-OP-020 |