CCC Managed Vector Store
A Managed Vector Store is a specialized data service designed to store, index, and retrieve high-dimensional vector embeddings, enabling similarity search and machine learning inference. These services are used in AI/ML pipelines for use cases such as semantic search, recommendation systems, and generative AI applications. Vector stores support operations like nearest neighbor search using approximate or exact methods and integrate with model-serving and ingestion pipelines. They are optimized for performance, scale, and integration with cloud-native tools.
Release Details
Version:
DEV
Assurance Level:
Release Manager:
DB
Development Build
Contributors
DT
Development Team
Change Log
- Development build - no formal changelog available
Capabilities
ID | Title | Description | Threat Mappings |
---|---|---|---|
CCC.Vector.F01 | Embedding Storage | Supports storage of high-dimensional vector embeddings derived from raw input data such as text, images, or audio. | 1 |
CCC.Vector.F02 | Vector Indexing | Provides creation and management of indexes optimized for similarity search, such as HNSW, IVF, or PQ. | 1 |
CCC.Vector.F03 | Similarity Search | Enables nearest-neighbor queries using a query embedding to return the most similar vectors from the store. | 1 |
CCC.Vector.F04 | Metadata Filtering | Supports structured filtering on metadata fields alongside vector similarity search queries. | 1 |
CCC.Vector.F05 | Batch Ingestion | Allows for high-throughput batch upload and deletion of vectors and associated metadata. | 1 |
CCC.Vector.F06 | Real-Time Querying | Provides low-latency response to vector similarity queries suitable for interactive applications. | 1 |
CCC.Vector.F07 | Index Lifecycle Management | Enables automated or manual creation, optimization, and removal of vector indexes. | 2 |
CCC.Vector.F08 | Embedding Format Compatibility | Supports standard vector formats and integrates with common embedding generators (e.g., OpenAI, HuggingFace, TensorFlow). | 1 |
CCC.Vector.F09 | Vector Dimension Management | Supports storing and managing vectors of specific or dynamic dimensionality, depending on model needs. | 1 |
CCC.Vector.F10 | Multi-modal Vector Support | Supports storing and searching across vectors derived from multiple modalities (e.g., text, image, audio). | 1 |
CCC.Vector.F11 | Query Access Control | Provides the ability to restrict who can run vector similarity or metadata filter queries, separate from data modification rights. | 1 |
CCC.Vector.F12 | Approximate or Exact Search Modes | Supports both approximate nearest neighbor (ANN) algorithms for speed and exact search modes for precision-critical applications. | 1 |
CCC.Core.F01 | Encryption in Transit Enabled by Default | The service automatically encrypts all data using industry-standard cryptographic protocols prior to transmission via a network interface. | 0 |
CCC.Core.F02 | Encryption at Rest Enabled by Default | The service automatically encrypts all data using industry-standard cryptographic protocols prior to being written to a storage medium. | 0 |
CCC.Core.F03 | Access Log Publication | The service automatically publishes structured, verbose records of activities performed within the scope of the service by external actors. | 3 |
CCC.Core.F04 | Transaction Rate Limits | The service can throttle, delay, or reject excess requests when transactions exceed a user-specified rate limit, and always provides industry-standard throughput up to that limit. | 1 |
CCC.Core.F06 | Access Control | The service automatically enforces user configurations to restrict or allow access to a specific component or a child resource based on factors such as user identities, roles, groups, or attributes. | 1 |
CCC.Core.F08 | Data Replication | The service automatically replicates data across multiple deployments simultaneously with parity, or may be configured to do so. | 2 |
CCC.Core.F09 | Metrics Publication | The service automatically publishes structured, numeric, time-series data points related to the performance, availability, and health of the service or its child resources. | 3 |
CCC.Core.F10 | Log Publication | The service automatically publishes structured, verbose records of activities, operations, or events that occur within the service. | 2 |
CCC.Core.F11 | Backup | The service can generate copies of its data or configurations in the form of automated backups, snapshot-based backups, or incremental backups. | 1 |
CCC.Core.F12 | Recovery | The service can be reverted to a previous state by providing a compatible backup or snapshot identifier. | 1 |
CCC.Core.F14 | API Access | The service exposes a port enabling external actors to interact programmatically with the service and its resources using HTTP protocol methods such as GET, POST, PUT, and DELETE. | 1 |
CCC.Core.F18 | Resource Versioning | The service automatically assigns versions to child resources which can be used to preserve, retrieve, and restore past iterations. | 2 |
CCC.Core.F20 | Resource Tagging | The service provides users with the ability to tag a child resource with metadata that can be reviewed or queried. | 1 |
CCC.Core.F21 | Resource Replication | The service may be configured to replicate child resources across multiple deployments. | 2 |
CCC.Core.F22 | Location Lock-In | The service may be configured to restrict the deployment of child resources to specific geographic locations. | 1 |
Threats
ID | Title | Description | External Mappings | Capability Mappings | Control Mappings |
---|---|---|---|---|---|
CCC.Vector.TH01 | Embedding Extraction and Model Inversion | Attackers may infer or reconstruct original data by probing vector similarity APIs, especially with unrestricted access. This enables model inversion attacks, membership inference, and unauthorized data leakage from stored embeddings. | 1 | 1 | 0 |
CCC.Vector.TH02 | Embedding and Index Poisoning | Adversaries may insert malicious or adversarial vectors into the index through ingestion endpoints, polluting the dataset and degrading search quality, or subtly steering results toward specific outcomes. | 1 | 1 | 3 |
CCC.Vector.TH03 | Cross-modal or Metadata Leakage | Attackers may infer sensitive information through metadata filters or by correlating embeddings across modalities (e.g., voice and face), bypassing surface-level access controls. | 1 | 1 | 1 |
CCC.Vector.TH04 | Index Corruption or Downgrade | Attackers with unauthorized access or excessive permissions may tamper with or roll back index versions, potentially restoring poisoned data or breaking downstream integrations. | 1 | 1 | 2 |
CCC.Vector.TH05 | Embedding Format or Dimension Attacks | Poor validation of embedding formats or dimensions can cause service crashes or logic errors. This can result in denial of service or incorrect similarity results. | 1 | 1 | 2 |
CCC.Vector.TH06 | Search Result Manipulation via ANN Bias | Approximate nearest neighbor (ANN) algorithms may yield non-deterministic or biased results. Adversaries may exploit these differences to evade detection or bias AI responses. | 2 | 1 | 1 |
CCC.Core.TH01 | Access is Granted to Unauthorized Users | Logic designed to give different permissions to different entities may be misconfigured or manipulated, allowing unauthorized entities to access restricted parts of the service, its data, or its child resources. This could result in a loss of data confidentiality or tolerance of unauthorized actions which impact the integrity and availability of resources and data. | 1 | 1 | 6 |
CCC.Core.TH02 | Data is Intercepted in Transit | Data transmitted by the service is susceptible to collection by any entity with access to any part of the transmission path. Packet observations can be used to support the planning of attacks by profiling origin points, destinations, and usage patterns. The data may also be vulnerable to interception or modification in transit if not properly encrypted, impacting the confidentiality or integrity of the transmitted data. | 1 | 1 | 1 |
CCC.Core.TH03 | Deployment Region Network is Untrusted | Systems are susceptible to unauthorized access or interception by actors with social or physical control over the network in which they are deployed. If the geopolitical status of the deployment network is untrusted, unstable, or insecure, this could result in a loss of confidentiality, integrity, or availability of the service and its data. | 1 | 1 | 1 |
CCC.Core.TH04 | Data is Replicated to Untrusted or External Locations | Systems are susceptible to unauthorized access or interception by actors with political or physical control over the network in which they are deployed. Confidentiality may be impacted if the data is replicated to a network where the geopolitical status is untrusted, unstable, or insecure. | 1 | 1 | 3 |
CCC.Core.TH05 | Interference with Replication Processes | Misconfigured or manipulated replication processes may lead to data being copied to unintended locations, delayed, modified, or not being copied at all. This could lead to compromised data confidentiality and integrity, potentially also affecting recovery processes and data availability. | 1 | 1 | 0 |
CCC.Core.TH06 | Data is Lost or Corrupted | Services that rely on accurate data are susceptible to disruption in the event of data loss or corruption. Any actions that lead to the unintended deletion, alteration, or limited access to data can impact the availability of the service and the system it is part of. | 1 | 1 | 1 |
CCC.Core.TH07 | Logs are Tampered With or Deleted | Tampering or deletion of service logs will reduce the system's ability to maintain an accurate record of events. Any actions that compromise the integrity of logs could disrupt system availability by disrupting monitoring, hindering forensic investigations, and reducing the accuracy of audit trails. | 1 | 1 | 1 |
CCC.Core.TH08 | Runtime Metrics are Manipulated | Manipulation of runtime metrics can lead to inaccurate representations of system performance and resource utilization. This compromised data integrity may also impact system availability through misinformed scaling decisions, budget exhaustion, financial losses, and hindered incident detection. | 1 | 1 | 0 |
CCC.Core.TH09 | Runtime Logs are Read by Unauthorized Entities | Unauthorized access to logs may expose valuable information about the system's configuration, operations, and security mechanisms. This could jeopardize system availability through the exposure of vulnerabilities and support the planning of attacks on the service, system, or network. If logs are not adequately sanitized, this may also directly impact the confidentiality of sensitive data. | 1 | 1 | 2 |
CCC.Core.TH10 | State-change Events are Read by Unauthorized Entities | Unauthorized access to state-change events can reveal information about the system's design and usage patterns. This opens the system up to attacks of opportunity and support the planning of attacks on the service, system, or network. | 1 | 1 | 0 |
CCC.Core.TH11 | Publications are Incorrectly Triggered | Incorrectly triggered publications may disseminate inaccurate or misleading information, creating a data integrity risk. Such misinformation can cause unintended operations to be initiated, conceal legitimate issues, and disrupt the availability or reliability of systems and their data. | 1 | 1 | 0 |
CCC.Core.TH12 | Resource Constraints are Exhausted | Exceeding the resource constraints through excessive consumption, resource-intensive operations, or lowering of rate-limit thresholds can impact the availability of elements such as memory, CPU, or storage. This may disrupt availability of the service or child resources by denying the associated functionality to users. If the impacted system is not designed to expect such a failure, the effect could also cascade to other services and resources. | 1 | 1 | 2 |
CCC.Core.TH13 | Resource Tags are Manipulated | When resource tags are altered, it can lead to misclassification or mismanagement of resources. This can reduce the efficacy of organizational policies, billing rules, or network access rules. Such changes could cause compromised confidentiality, integrity, or availability of the system and its data. | 1 | 1 | 0 |
CCC.Core.TH14 | Older Resource Versions are Used | Running older versions of child resources can expose the system to known vulnerabilities that have been addressed in more recent versions. If the version identifier is detected by an attacker, it may be possible to exploit these vulnerabilities to compromise the confidentiality, integrity, or availability of the system and its data. | 1 | 1 | 0 |
CCC.Core.TH15 | Automated Enumeration and Reconnaissance by Non-human Entities | Automated processes may be used to gather details about service and child resource elements such as APIs, file systems, or directories. This information can reveal vulnerabilities, misconfigurations, and the network topology, which can be used to plan an attack against the system, the service, or its child resources. | 1 | 1 | 1 |
CCC.Core.TH16 | Publications are Disabled | Publication of events, metrics, and runtime logs may be disabled, leading to a lack of expected security and operational information being shared. This can impact system availability by delaying the detection of incidents while also impacting system design decisions and enforcement of operational thresholds, such as autoscaling or cost management. | 1 | 1 | 0 |
CCC.Core.TH17 | Responses are Generated for Unauthorized Requests | The service may generate responses to requests from unauthorized entities. This could lead to the exposure of system details, which may be used to plan an attack against the service, system, or network. Additionally, allocating resources to service the request could lead to a denial of service for legitimate users, leading to a loss of availability anywhere in the system. | 1 | 1 | 0 |
Controls
ID | Title | Objective | Control Family | Threat Mappings | Guideline Mappings | Assessment Requirements |
---|---|---|---|---|---|---|
CCC.Vector.C01 | Validate Embeddings Before Indexing | Ensure all incoming embeddings are structurally and statistically validated before indexing to prevent poisoning or corruption. | Vector Indexing | 3 | 1 | 1 |
CCC.Vector.C02 | Enforce Role-Based Index Lifecycle Management | Restrict index lifecycle operations (create, delete, rollback) to privileged identities using fine-grained access controls. | Vector Indexing | 3 | 1 | 1 |
CCC.Vector.C03 | Enforce Metadata-Level Access Controls | Apply access control policies to metadata fields used in filtering to prevent unauthorized exposure or inference. | Vector Indexing | 2 | 3 | 1 |
CCC.Vector.C04 | Enforce Ingestion Quotas and Throttling | Prevent ingestion-based DoS or index pollution by rate-limiting vector submissions and enforcing quotas. | Vector Indexing | 2 | 1 | 1 |
CCC.Vector.C05 | Enforce Index Versioning with Rollback Protection | Ensure vector indexes are versioned and that rollback operations are authorized and auditable. | Vector Indexing | 3 | 1 | 1 |
CCC.Vector.C06 | Enforce Dimensional and Format Constraints | Reject embeddings that do not conform to expected model specifications (dimensions, format, etc). | Vector Indexing | 2 | 1 | 1 |
CCC.Vector.C07 | Support Explicit ANN vs. Exact Search Configuration | Provide clients with the option to enforce exact-match (non-ANN) search where search fidelity is critical. | Vector Indexing | 1 | 0 | 1 |
CCC.Core.C01 | Encrypt Data for Transmission | Ensure that all communications are encrypted in transit to protect data integrity and confidentiality. | Data | 1 | 8 | 5 |
CCC.Core.C02 | Encrypt Data for Storage | Ensure that all data stored is encrypted at rest using strong encryption algorithms. | Data | 1 | 7 | 1 |
CCC.Core.C06 | Restrict Deployments to Trust Perimeter | Ensure that the service and its child resources are only deployed on infrastructure in locations that are explicitly included within a defined trust perimeter. | Data | 1 | 4 | 2 |
CCC.Core.C09 | Ensure Integrity of Access Logs | Ensure that access logs are always recorded to an external location that cannot be manipulated from the context of the service(s) it contains logs for. | Data | 3 | 5 | 3 |
CCC.Core.C10 | Restrict Data Replication to Trust Perimeter | Ensure that data is only replicated on infrastructure in locations that are explicitly included within a defined trust perimeter. | Data | 1 | 4 | 1 |
CCC.Core.C03 | Implement Multi-factor Authentication (MFA) for Access | Ensure that all sensitive activities require two or more identity factors during authentication to prevent unauthorized access. | Identity and Access Management | 1 | 6 | 4 |
CCC.Core.C05 | Prevent Access from Untrusted Entities | Ensure that secure access controls enforce the principle of least privilege to restrict access to authorized entities from explicitly trusted sources only. | Identity and Access Management | 1 | 8 | 6 |
CCC.Core.C04 | Log All Access and Changes | Ensure that all access attempts are logged to maintain a detailed audit trail for security and compliance purposes. | Logging & Monitoring | 1 | 5 | 3 |
CCC.Core.C07 | Alert on Unusual Enumeration Activity | Ensure that logs and associated alerts are generated when unusual enumeration activity is detected that may indicate reconnaissance activities. | Logging & Monitoring | 1 | 4 | 2 |