Query Vector Data with CQL
Cassandra Query Language (CQL) uses the vector data type to enable vector similarity searches of your data.
Prerequisite
Your serverless database with Vector Search is ready to use for your content.
If you have not yet created a serverless Astra database with Vector Search, see creating an Astra serverless database with Vector Search.
This section guides you in creating a table schema and an index, loading Vector Search data into your database, and using Cassandra Query Language (CQL) to work with that data.
Connect to your database using CQLSH
In your database with the Vector Search dashboard, select the CQL Console to open a CQLSH instance that is connected to your database.
Alternatively, you can connect to your database by downloading the standalone version of CQLSH and selecting the "DataStax Astra with support for Vector Type" version. For more information, see connecting CQL console.
Create vector schema
- Select the keyspace you want to use for your Vector Search table. This example uses vsearch as the keyspace name:
USE vsearch;
- Create a new table in your keyspace, including the item_vector column to hold the vector. The code below creates a vector with five values:
CREATE TABLE IF NOT EXISTS vsearch.products (
  id int PRIMARY KEY,
  name TEXT,
  description TEXT,
  item_vector VECTOR<FLOAT, 5> // create a 5-dimensional embedding
);
- Create the custom index with Storage Attached Indexing (SAI):
CREATE CUSTOM INDEX IF NOT EXISTS ann_index
ON vsearch.products(item_vector)
USING 'StorageAttachedIndex';
For more about SAI, see the Storage Attached Indexing documentation.
The index can be created with options that define the similarity function:
CREATE CUSTOM INDEX IF NOT EXISTS ann_index
ON vsearch.products(item_vector)
USING 'StorageAttachedIndex'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' };
Valid values for similarity_function are COSINE (default), DOT_PRODUCT, or EUCLIDEAN.
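To make the three options concrete, here is a minimal Python sketch of the underlying mathematical measures. It is illustrative only: the function names are hypothetical, and the scores a database returns may be normalized or scaled differently from these raw values.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; a larger value means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; a smaller value means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two small hypothetical vectors, chosen only for illustration.
a = [1.0, 0.0]
b = [0.0, 2.0]
print(dot_product(a, b), cosine(a, b), euclidean_distance(a, b))
```

Cosine ignores vector length and compares direction only, while the dot product is sensitive to length. When all vectors are normalized to unit length, the two produce the same ranking.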
Load Vector Search data into your database
Insert data into the table using the new type:
INSERT INTO vsearch.products (id, name, description, item_vector)
VALUES (
1, //id
'Coded Cleats', //name
'ChatGPT integrated sneakers that talk to you', //description
[0.1, 0.15, 0.3, 0.12, 0.05] //item_vector
);
INSERT INTO vsearch.products (id, name, description, item_vector)
VALUES (2, 'Logic Layers',
'An AI quilt to help you sleep forever',
[0.45, 0.09, 0.01, 0.2, 0.11]);
INSERT INTO vsearch.products (id, name, description, item_vector)
VALUES (5, 'Vision Vector Frame',
'A deep learning display that controls your mood',
[0.1, 0.05, 0.08, 0.3, 0.6]);
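Because the column is declared as VECTOR<FLOAT, 5>, every inserted vector must contain exactly five values. The following Python sketch shows one way to check that on the client before building an INSERT statement. The helper name to_cql_vector is hypothetical, and the sketch only produces the text of a vector literal; it does not connect to a database.

```python
DIMENSION = 5  # matches VECTOR<FLOAT, 5> in the table definition


def to_cql_vector(values, dimension=DIMENSION):
    """Check the length of an embedding and render it as a CQL vector literal."""
    floats = [float(v) for v in values]
    if len(floats) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(floats)}")
    return "[" + ", ".join(repr(v) for v in floats) + "]"


print(to_cql_vector([0.1, 0.15, 0.3, 0.12, 0.05]))
# prints: [0.1, 0.15, 0.3, 0.12, 0.05]
```

In application code, binding the vector as a parameter of a prepared statement is generally preferable to assembling query text by hand.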
Query vector data with CQL
To query data using Vector Search, use a SELECT query:
SELECT * FROM vsearch.products
ORDER BY item_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
LIMIT 1;
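ANN stands for approximate nearest neighbor. To see which row an exact search would rank first for this query, the following Python sketch compares the query vector against the three example rows by brute force. It assumes the index uses the default COSINE similarity function; the names ROWS and nearest are hypothetical.

```python
import math

# The three example rows inserted earlier, keyed by id.
ROWS = {
    1: [0.1, 0.15, 0.3, 0.12, 0.05],
    2: [0.45, 0.09, 0.01, 0.2, 0.11],
    5: [0.1, 0.05, 0.08, 0.3, 0.6],
}


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, rows, limit=1):
    """Return the ids of the `limit` rows most similar to `query`."""
    ranked = sorted(rows, key=lambda row_id: cosine(query, rows[row_id]), reverse=True)
    return ranked[:limit]


print(nearest([0.15, 0.1, 0.1, 0.35, 0.55], ROWS))  # prints: [5]
```

Row 5 ('Vision Vector Frame') points in nearly the same direction as the query vector, so it ranks first. A brute-force scan like this compares the query against every row, which is the cost an ANN index is designed to avoid on large tables.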
To include the similarity score of the best-matching row in the results, call a similarity function in the SELECT query:
SELECT description, similarity_cosine(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05])
FROM vsearch.products
ORDER BY item_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
LIMIT 1;
The supported functions for this type of query are:
- similarity_dot_product
- similarity_cosine
- similarity_euclidean
Each function takes the parameters (<vector_column>, <embedding_value>). Both parameters represent vectors.
The embeddings in this quickstart were randomly generated. In practice, you would run both your source documents and the query you want to match through an embeddings generator. This example only shows the mechanics of using CQL to create and query vector search data.
With these code examples, you have a working example of our Vector Search. Next, load your own data and use the Vector Search functionality.