Quality of Service and DDoS Prevention

CCC.MARefArc.CN06 · PREV

Protect model and tool availability by enforcing quality-of-service controls, rate limits, and abuse and DDoS mitigation at the gateways.

Related Capabilities

ID | Title | Description
CCC.MARefArc.CP16 | Model-interaction zero-trust guardrails | Enforces authentication and authorization for every inference request and applies input validation against prompt injection, output filtering and redaction, access control, rate limits, and cost management before and after model execution.
CCC.MARefArc.CP06 | Agent collaboration and orchestration patterns | Supports supervisor/worker decomposition, skills-based routing, and agent-as-a-tool handoff for decomposing and executing complex tasks across multiple agents.
CCC.MARefArc.CP15 | LLM inference gateway routing | Validates inference requests and routes each to the correct model instance, abstracting model hosting behind a consistent interface.
CCC.MARefArc.CP14 | Approved-model registry and lifecycle | Catalog of approved models with metadata, version information, configuration parameters, and usage constraints, ensuring agents access only models meeting organizational, regulatory, and security standards.

Related Threats

ID | Title | Description
CCC.MARefArc.TH08 | Denial of Wallet via token-expensive or unthrottled agentic calls | Token-expensive prompts, large-document chunking, or poorly throttled agentic loops drive excessive model and tool invocations, exhausting token budgets, triggering throttling, or inflating cost beyond capacity planning.
CCC.MARefArc.TH09 | Technology service provider outage or degradation | Tight coupling to a specific external model provider with limited failover leaves the system exposed to provider outages or performance degradation under load, violating business-continuity expectations.
CCC.MARefArc.TH10 | VRAM exhaustion on model-serving infrastructure | Configuration changes, aggressive caching, or memory leaks in model-serving libraries behind the LLM gateway exhaust GPU VRAM, degrading responsiveness or crashing model serving.

Assessment Requirements

ID | Text | Applicability
CCC.MARefArc.CN06.AR01 | The LLM and MCP gateways MUST enforce per-consumer rate limits and quotas. | tlp-clear, tlp-green, tlp-amber, tlp-red
CCC.MARefArc.CN06.AR02 | Gateways MUST apply DDoS and abuse detection and load shedding to preserve availability under load. | tlp-clear, tlp-green, tlp-amber, tlp-red
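
The two requirements above can be illustrated with a minimal sketch of a gateway admission check. It is not a prescribed implementation: the class and parameter names (`GatewayLimiter`, `rate`, `burst`, `quota`, `max_in_flight`) are hypothetical, the token-bucket algorithm is one common way to enforce per-consumer rate limits, and the in-flight cap stands in for load shedding. Quota-window reset, persistence, and abuse detection are out of scope here.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""
    rate: float
    capacity: float
    tokens: float
    updated: float

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class GatewayLimiter:
    """Hypothetical admission check for an LLM or MCP gateway (illustrative only)."""

    def __init__(self, rate: float, burst: float, quota: int,
                 max_in_flight: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate                    # sustained requests/second per consumer
        self.burst = burst                  # bucket capacity per consumer
        self.quota = quota                  # max admitted requests per quota window
        self.max_in_flight = max_in_flight  # global concurrency cap
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}
        self.used: Dict[str, int] = {}
        self.in_flight = 0

    def admit(self, consumer_id: str) -> str:
        now = self.clock()
        # AR02-style load shedding: refuse new work once the gateway is saturated.
        if self.in_flight >= self.max_in_flight:
            return "shed"
        # AR01-style quota: hard cap per consumer (window reset not shown).
        if self.used.get(consumer_id, 0) >= self.quota:
            return "quota_exceeded"
        # AR01-style rate limit: one bucket per consumer, created full.
        bucket = self.buckets.get(consumer_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, self.burst, now)
            self.buckets[consumer_id] = bucket
        if not bucket.try_consume(now):
            return "rate_limited"
        self.used[consumer_id] = self.used.get(consumer_id, 0) + 1
        self.in_flight += 1
        return "admitted"

    def release(self) -> None:
        # Call when a request finishes so capacity is returned.
        self.in_flight = max(0, self.in_flight - 1)
```

A gateway handler would call `admit(consumer_id)` before forwarding a request and `release()` once it completes. Checking the global cap first means that under saturation no per-consumer state is touched, which keeps the shedding path cheap; a real deployment would likely reverse or refine this ordering depending on whether fairness or protection matters more.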

Guideline Mappings

Framework | ID | Remarks
finos-air | AIR-PREV-008 |