Quality of Service and DDoS Prevention

CCC.MARefArc.CN06 · PREV

Protect model and tool availability by enforcing quality-of-service controls, rate limits, and abuse and DDoS mitigation at the LLM and MCP gateways.

Related Capabilities

ID	Title	Description
CCC.MARefArc.CP16	Model-interaction zero-trust guardrails	Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.
CCC.MARefArc.CP06	Agent collaboration and orchestration patterns	Supports supervisor/worker decomposition, skills-based routing, and agent-as-a-tool handoff for decomposing and executing complex tasks across multiple agents.
CCC.MARefArc.CP15	LLM inference gateway routing	Validates inference requests and routes each to the correct model instance, abstracting model hosting behind a consistent interface.
CCC.MARefArc.CP14	Approved-model registry and lifecycle	Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards.

Related Threats

ID	Title	Description
CCC.MARefArc.TH08	Denial of Wallet via token-expensive or unthrottled agentic calls	Token-expensive prompts, large-document chunking, or poorly throttled agentic loops drive excessive model and tool invocations, exhausting token budgets, triggering throttling, or inflating cost beyond capacity planning.
CCC.MARefArc.TH09	Technology service provider outage or degradation	Tight coupling to a specific external model provider with limited failover leaves the system exposed to provider outages or performance degradation under load, violating business-continuity expectations.
CCC.MARefArc.TH10	VRAM exhaustion on model-serving infrastructure	Configuration changes, aggressive caching, or memory leaks in model-serving libraries behind the LLM gateway exhaust GPU VRAM, degrading responsiveness or crashing model serving.
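
As an illustration of how a gateway might counter the Denial of Wallet threat (TH08), the following is a minimal sketch of a per-consumer token budget. The class name `TokenBudget`, the `limit_per_window` parameter, and the idea of reserving an estimated token count before each model call are assumptions made for this example; the control itself does not prescribe a mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Tracks model tokens spent per consumer within one budget window.

    Hypothetical helper: names and the window concept are illustrative only.
    """

    limit_per_window: int  # assumed budget, e.g. tokens per consumer per day
    spent: dict[str, int] = field(default_factory=dict)

    def try_spend(self, consumer_id: str, estimated_tokens: int) -> bool:
        """Reserve tokens for a call; refuse if the budget would be exceeded."""
        used = self.spent.get(consumer_id, 0)
        if used + estimated_tokens > self.limit_per_window:
            return False  # caller should reject or defer the request
        self.spent[consumer_id] = used + estimated_tokens
        return True

    def reset_window(self) -> None:
        """Clear all usage; intended to be called at the start of each window."""
        self.spent.clear()


budget = TokenBudget(limit_per_window=1000)
first = budget.try_spend("agent-a", 600)    # fits within the budget
second = budget.try_spend("agent-a", 500)   # would exceed it, so it is refused
other = budget.try_spend("agent-b", 500)    # budgets are tracked per consumer
```

Checking the budget before the call, rather than after, means a runaway agentic loop is stopped at the gateway instead of being discovered on the invoice. A real deployment would likely also reconcile the estimate against the token count the model actually reports.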

Assessment Requirements

ID	Text	Applicability
CCC.MARefArc.CN06.AR01	The LLM and MCP gateways MUST enforce per-consumer rate limits and quotas.	tlp-clear, tlp-green, tlp-amber, tlp-red
CCC.MARefArc.CN06.AR02	Gateways MUST apply DDoS and abuse detection and load shedding to preserve availability under load.	tlp-clear, tlp-green, tlp-amber, tlp-red
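
The two requirements above could be combined at a gateway roughly as follows. This is a sketch under stated assumptions, not a reference implementation: it uses a token-bucket algorithm for the per-consumer rate limit (AR01) and a simple cap on in-flight requests as the load-shedding signal (AR02). The names `TokenBucket`, `Gateway`, `admit`, and `release`, the HTTP-style status codes, and all numeric limits are hypothetical, and the code is single-threaded for clarity.

```python
import time
from typing import Callable


class TokenBucket:
    """Token-bucket limiter: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start full so short bursts are allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class Gateway:
    """Hypothetical admission check: shed load first, then rate-limit."""

    def __init__(self, rate: float, burst: float, max_in_flight: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.burst = burst
        self.max_in_flight = max_in_flight
        self.clock = clock
        self.buckets: dict[str, TokenBucket] = {}
        self.in_flight = 0

    def admit(self, consumer_id: str) -> int:
        """Return 200 (admitted), 429 (consumer over limit), or 503 (shedding)."""
        if self.in_flight >= self.max_in_flight:
            return 503  # overloaded: shed regardless of who is asking
        bucket = self.buckets.get(consumer_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, self.clock)
            self.buckets[consumer_id] = bucket
        if not bucket.allow():
            return 429  # this consumer exceeded its own rate limit
        self.in_flight += 1
        return 200

    def release(self) -> None:
        """Call when an admitted request finishes."""
        self.in_flight = max(0, self.in_flight - 1)
```

A caller would invoke `admit(consumer_id)` before forwarding a request to a model or tool and `release()` once the response completes. Keeping one bucket per consumer prevents a single noisy agent from consuming the shared allowance, while the global in-flight cap protects the backend even when every consumer is individually within its limit.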

Guideline Mappings

Framework	ID	Remarks
finos-air	AIR-PREV-008