Validate Embeddings Before Indexing

CCC.Vector.CN01 · Ingestion

Ensure all incoming embeddings are structurally and statistically validated before indexing to prevent poisoning or corruption.
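The control does not say which statistical checks to apply. As a minimal sketch, assuming that rejecting non-finite components and out-of-range L2 norms is an acceptable baseline, a check could look like the following. The function name `statistical_issues` and the thresholds `min_norm` and `max_norm` are hypothetical and would need tuning per embedding model.

```python
import math
from typing import List, Sequence


def statistical_issues(
    vector: Sequence[float],
    min_norm: float = 1e-6,   # hypothetical lower bound, tune per model
    max_norm: float = 100.0,  # hypothetical upper bound, tune per model
) -> List[str]:
    """Return a list of statistical problems found in one embedding."""
    # NaN or infinite components make the norm meaningless, so stop early.
    if any(not math.isfinite(x) for x in vector):
        return ["non-finite component"]

    issues: List[str] = []
    norm = math.sqrt(sum(x * x for x in vector))
    if norm < min_norm:
        issues.append("near-zero norm")
    elif norm > max_norm:
        issues.append("norm above threshold")
    return issues
```

An empty list means the vector passed these checks. A production pipeline would likely also compare each batch against the distribution of embeddings already indexed, which this sketch does not attempt.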

Related Capabilities

ID | Title | Description
CCC.Vector.CP02 | Vector Indexing | Provides creation and management of indexes optimized for similarity search, such as HNSW, IVF, or PQ.
CCC.Vector.CP05 | Batch Ingestion | Allows for high-throughput batch upload and deletion of vectors and associated metadata.
CCC.Vector.CP07 | Index Lifecycle Management | Enables automated or manual creation, optimization, and removal of vector indexes.
CCC.Vector.CP08 | Embedding Format Compatibility | Supports standard vector formats and integrates with common embedding generators (e.g., OpenAI, HuggingFace, TensorFlow).
CCC.Vector.CP09 | Vector Dimension Management | Supports storing and managing vectors of specific or dynamic dimensionality, depending on model needs.
CCC.Core.CP04 | Transaction Rate Limits | The service can throttle, delay, or reject excess requests when transactions exceed a user-specified rate limit, and always provides industry-standard throughput up to that limit.
CCC.Core.CP16 | Budgeting | The service may be configured to take a user-specified action when a spending threshold is met or exceeded on a child or networked resource.
CCC.Core.CP19 | Child Resource Scaling | The service may be configured to scale child resources automatically or on-demand.

Related Threats

ID | Title | Description
CCC.Vector.TH02 | Embedding and Index Poisoning | Adversaries may insert malicious or adversarial vectors into the index through ingestion endpoints, polluting the dataset and degrading search quality, or subtly steering results toward specific outcomes.
CCC.Vector.TH05 | Embedding Format or Dimension Attacks | Poor validation of embedding formats or dimensions can cause service crashes or logic errors. This can result in denial of service or incorrect similarity results.
CCC.Core.TH12 | Resource Constraints are Exhausted | Exceeding the resource constraints through excessive consumption, resource-intensive operations, or lowering of rate-limit thresholds can impact the availability of elements such as memory, CPU, or storage. This may disrupt availability of the service or child resources by denying the associated functionality to users. If the impacted system is not designed to expect such a failure, the effect could also cascade to other services and resources.

Assessment Requirements

ID | Text | Applicability
CCC.Vector.CN01.AR01 | When a vector embedding is submitted for indexing, the system MUST validate that it matches expected schema, dimension, and format profiles. | tlp-clear, tlp-green, tlp-amber, tlp-red
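One way to read CCC.Vector.CN01.AR01 is as a gate that runs on every record before it reaches the index. The sketch below is an illustration under stated assumptions, not a prescribed implementation: the record layout (a dict with `id` and `vector` keys), the `EmbeddingProfile` class, and the `validate_record` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from numbers import Real
from typing import Any, List


@dataclass(frozen=True)
class EmbeddingProfile:
    """Hypothetical description of what the target index expects."""
    dimension: int


def validate_record(record: Any, profile: EmbeddingProfile) -> List[str]:
    """Return schema, dimension, and format errors for one submitted record."""
    # Schema: the record must be an object carrying both assumed fields.
    if not isinstance(record, dict):
        return ["record is not an object"]
    missing = [f"missing field: {name}" for name in ("id", "vector") if name not in record]
    if missing:
        return missing

    # Format: the vector must be a plain sequence of real numbers.
    vector = record["vector"]
    if not isinstance(vector, (list, tuple)):
        return ["vector is not a sequence"]

    errors: List[str] = []
    # Dimension: the length must match the profile exactly.
    if len(vector) != profile.dimension:
        errors.append(f"expected dimension {profile.dimension}, got {len(vector)}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if any(isinstance(x, bool) or not isinstance(x, Real) for x in vector):
        errors.append("vector contains non-numeric components")
    return errors


profile = EmbeddingProfile(dimension=3)
accepted = validate_record({"id": "doc-1", "vector": [0.1, 0.2, 0.3]}, profile)
rejected = validate_record({"id": "doc-2", "vector": [0.1, 0.2]}, profile)
```

Here `accepted` is an empty list and `rejected` holds a dimension error. Returning a list of errors rather than raising on the first failure lets a batch ingestion path report every rejected record in one response.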

Guideline Mappings

Framework | ID | Remarks
FINOS-AIGF | AIR-PREV-002 | Data Filtering From External Knowledge Bases