| CCC.MARefArc.TH06 | Foundation-model training and fine-tuning data poisoning | Adversaries tamper with training, fine-tuning, or third-party data feeds behind the approved models, mislabeling data or embedding backdoor triggers and biases that corrupt downstream decisions without visible symptoms until a major failure. |
| CCC.MARefArc.TH07 | Adaptive-learning and continuous-learning exploitation | The adaptive-learning capability that refines prompts and configurations from execution outcomes can be steered by an adversary who systematically feeds misleading signals, gradually skewing agent behaviour when validation of learning inputs is inadequate. |
| CCC.MARefArc.TH22 | Poor-quality, drifting, and bias-amplifying data | Inaccurate, incomplete, outdated, or biased grounding and training data lead to unreliable outputs, while data drift and concept drift erode predictive power over time and amplify historical errors at scale. |
| CCC.MARefArc.TH23 | Discriminatory outputs from bias | Biased training data, architectural and feature choices, proxy variables such as postal codes, and uncorrected feedback loops cause systematically discriminatory outcomes against protected groups, with legal and reputational exposure. |
| CCC.MARefArc.TH25 | Non-compliant outputs and model-risk-management gaps | AI-generated advice, marketing, or communications that fail KYC, suitability, disclosure, record-keeping, or model-risk-management expectations create regulatory exposure; weak supervision and accountability lines turn that exposure into direct non-compliance. |
| CCC.MARefArc.TH26 | Intellectual-property leakage and licensing violations | Outputs may replicate copyrighted training material, employees may leak trade secrets into AI tools, and improper platform licensing or terms-of-service violations create contractual and legal liability. |
| CCC.MARefArc.TH01 | Model memorization leaks sensitive data across sessions | The hosted models accessed through the LLM layer may memorize sensitive inputs or training data and later disclose customer PII, proprietary algorithms, or trading strategies, including cross-user leakage into unrelated sessions. |
| CCC.MARefArc.TH02 | Hosted-provider data-handling exposure | Sensitive data submitted through the LLM gateway to third-party hosted models is exposed when the provider lacks transparent encryption, retention limits, or secure-deletion guarantees, leaving the institution without control over data it no longer holds. |
| CCC.MARefArc.TH03 | Embedding inversion and membership inference on the vector store | Vectors stored for semantic retrieval can be inverted to reconstruct original source text, or probed to infer whether specific confidential information is present, exposing PII or proprietary content held in the knowledge layer. |
| CCC.MARefArc.TH04 | Embedding-store poisoning degrades retrieved context | An actor with write access injects malicious or misleading embeddings into the vector store, degrading the accuracy of retrieved grounding context; the dense numerical representation makes the tampering hard to detect. |
| CCC.MARefArc.TH05 | Vector-store access-control, encryption, and audit gaps | Missing role-based access control, encryption at rest, or audit logging on the vector store allows unauthorized retrieval, modification, or undetected exfiltration of embeddings derived from sensitive internal data. |
| CCC.MARefArc.TH16 | Confident hallucination and fabricated facts | Lacking ground truth, and given ambiguous prompts or helpfulness-biased tuning, the model fabricates plausible but false facts, figures, or citations and presents them with a fluency that makes the errors hard to catch and likely to be acted upon. |