CCC.GenAI.TH02 - Data Poisoning

Threat ID: CCC.GenAI.TH02

Title: Data Poisoning

Description:

Data poisoning occurs when training, fine-tuning or embedding data is tampered with in order to modify the model's behaviour, for example steering it towards specific outputs, degrading performance or introducing backdoors.

Related Capabilities

ID	Title	Description
CCC.Core.F02	Encryption at Rest Enabled by Default	The service automatically encrypts all data using industry-standard cryptographic protocols prior to being written to a storage medium.
CCC.Core.F06	Access Control	The service automatically enforces user configurations to restrict or allow access to a specific component or a child resource based on factors such as user identities, roles, groups, or attributes.
CCC.GenAI.F03	Embedding Model Selection	Ability to select a foundation model used for tasks like semantic search, clustering, and document similarity by converting text into vector embeddings.
CCC.GenAI.F06	Customizable Model Selection	Provide users the ability to fine-tune models with their own data.
CCC.GenAI.F21	Generate Content	Ability to generate a response given a foundation model, parameter values, and a prompt.
CCC.GenAI.F22	Data Control	Ensures prompts, model outputs, embeddings, and training data fed by customers are not used to train foundation models.
CCC.GenAI.F24	Content Moderation	Ensure the service detects and filters abusive, harmful, and sensitive information to ensure responsible and safe use of the service.

External Mappings

Reference ID	Entry ID	Remarks
FINOS-AIGF	AIR-SEC-009	Data Poisoning
SAIF	DP	Data Poisoning
OWASP-LLM-TOP10	LLM04:2025	Data and Model Poisoning
MITRE-ATLAS	AML.T0020	Poison Training Data
MITRE-ATLAS	AML.T0070	RAG Poisoning

Controls

ID	Title	Objective	Control Family	Threat Mappings	Guideline Mappings	Assessment Requirements
CCC.GenAI.C03	Data Provenance and Source Vetting	Ensure that all data for training, fine-tuning or RAG comes from trusted, approved sources and is authorised for the intended purposes in order to prevent the initial introduction of malicious content or leaked sensitive data.	Data	2	3	2
CCC.GenAI.C04	Sanitisation of Ingested Data	Validate and sanitise all data ingested by GenAI systems from external sources or internal knowledge bases, whether for training, conversion to vector embeddings, or real-time retrieval, in order to remove or redact poisoned or sensitive data before further processing.	Data	2	3	2
CCC.GenAI.C08	Quality Control and Red Teaming	Establish a formal program for quality evaluation and adversarial testing (red teaming) to ensure GenAI systems meet all business, quality, security and compliance requirements before being deployed into production environments.	Model Assurance and Evaluation	5	5	2
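
As an illustration of the provenance objective in CCC.GenAI.C03, the following is a minimal sketch of a pre-ingestion check. It is not part of the catalog. It assumes a hypothetical allowlist of approved sources (`APPROVED_SOURCES`) and a hypothetical manifest mapping each record ID to the SHA-256 digest recorded when the data was approved. All names, the record layout and the rejection reasons are illustrative assumptions.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical allowlist of approved data sources (an assumption for this sketch).
APPROVED_SOURCES = {"internal-kb", "vendor-a"}


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def vet_record(record: dict, manifest: Dict[str, str]) -> Tuple[bool, str]:
    """Decide whether one candidate record may be ingested.

    record:   {"id": str, "source": str, "content": bytes} (assumed layout)
    manifest: record id -> digest recorded at approval time (assumed to exist)
    """
    # Reject data that does not come from an approved source.
    if record["source"] not in APPROVED_SOURCES:
        return False, "unapproved source"
    # Reject data that was never approved.
    expected = manifest.get(record["id"])
    if expected is None:
        return False, "not in manifest"
    # Reject data whose content changed after approval (possible tampering).
    if sha256_hex(record["content"]) != expected:
        return False, "digest mismatch"
    return True, "ok"


def filter_dataset(
    records: List[dict], manifest: Dict[str, str]
) -> Tuple[List[dict], List[Tuple[str, str]]]:
    """Split records into accepted records and (id, reason) rejections."""
    accepted, rejected = [], []
    for record in records:
        ok, reason = vet_record(record, manifest)
        if ok:
            accepted.append(record)
        else:
            rejected.append((record["id"], reason))
    return accepted, rejected
```

A check of this kind only detects changes made after the digest was recorded and data arriving from unlisted sources. It does not detect content that was already malicious when it was approved, which is where sanitisation (CCC.GenAI.C04) and adversarial testing (CCC.GenAI.C08) apply.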