Node.js driver quickstart

DataStax recommends the TypeScript client for Serverless (Vector) databases. Use the Node.js driver only if you are working with an existing application that previously used a CQL-based driver or if you plan to explicitly use CQL.

Review the Connection methods comparison page to determine the option that best suits your use case.

This quickstart walks through an end-to-end workflow with the Node.js driver: connect to your database, load a set of vector embeddings, and run a similarity search to find the vectors closest to the one in your query.

Prerequisites

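You need the following items to complete this quickstart:

A Serverless (Vector) database
Node.js with npm
The path to the database's secure connect bundle, set in the ASTRA_DB_SECURE_BUNDLE_PATH environment variable
An application token for the database, set in the ASTRA_DB_APPLICATION_TOKEN environment variable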

Install the cassandra-driver package

Install the cassandra-driver package using npm to connect your Node.js application to your database.

npm install cassandra-driver

Import libraries and connect to the database

Import the necessary libraries and establish a connection to your database.

const cassandra = require('cassandra-driver');

const cloud = { secureConnectBundle: process.env['ASTRA_DB_SECURE_BUNDLE_PATH'] };
const authProvider = new cassandra.auth.PlainTextAuthProvider('token', process.env['ASTRA_DB_APPLICATION_TOKEN']);
const client = new cassandra.Client({ cloud, authProvider });

async function run() {
    await client.connect();

    // ...
}
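Both connection settings come from environment variables, so a missing variable tends to surface later as a less obvious driver error. The following is a minimal sketch of an up-front check. `readAstraConfig` is a hypothetical helper, not part of cassandra-driver, and it assumes the same two variable names used in the snippet above.

```javascript
// Hypothetical helper: fail fast if a required environment variable is missing.
function readAstraConfig(env = process.env) {
    const required = ['ASTRA_DB_SECURE_BUNDLE_PATH', 'ASTRA_DB_APPLICATION_TOKEN'];
    const missing = required.filter((name) => !env[name]);
    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`);
    }
    return {
        secureConnectBundle: env.ASTRA_DB_SECURE_BUNDLE_PATH,
        token: env.ASTRA_DB_APPLICATION_TOKEN,
    };
}

// Usage with the driver (shown as a comment so this sketch runs without a database):
// const { secureConnectBundle, token } = readAstraConfig();
// const client = new cassandra.Client({
//     cloud: { secureConnectBundle },
//     authProvider: new cassandra.auth.PlainTextAuthProvider('token', token),
// });
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching the real process environment.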

Create a table and vector-compatible Storage Attached Index (SAI)

Create a table named vector_test in the default_keyspace of your database, with an integer id as the primary key, a text field, and a 5-dimensional float vector. This example code also creates a custom index on the vector column for efficient similarity searches using cosine similarity.

// ...

const keyspace = 'default_keyspace';
const v_dimension = 5;

await client.execute(`
  CREATE TABLE IF NOT EXISTS ${keyspace}.vector_test (id INT PRIMARY KEY,
  text TEXT, vector VECTOR<FLOAT,${v_dimension}>);
`);

await client.execute(`
    CREATE CUSTOM INDEX IF NOT EXISTS idx_vector_test
    ON ${keyspace}.vector_test
        (vector) USING 'StorageAttachedIndex' WITH OPTIONS =
        {'similarity_function' : 'cosine'};
`);

// ...
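Because the keyspace name and vector dimension are interpolated into the CQL text, it can help to validate them before building the statements. The sketch below is one way to do that. `buildVectorSchema` and the `IDENTIFIER` pattern are hypothetical names introduced for illustration, and the list of accepted similarity functions (cosine, dot_product, euclidean) is an assumption about what your database supports.

```javascript
// Hypothetical helper: build the two CQL statements after validating the inputs.
// Accepts only simple unquoted identifiers (a letter followed by letters, digits, or underscores).
const IDENTIFIER = /^[a-zA-Z][a-zA-Z0-9_]*$/;

function buildVectorSchema(keyspace, table, dimension, similarity = 'cosine') {
    if (!IDENTIFIER.test(keyspace) || !IDENTIFIER.test(table)) {
        throw new Error('Keyspace and table must be simple unquoted identifiers');
    }
    if (!Number.isInteger(dimension) || dimension < 1) {
        throw new Error('Vector dimension must be a positive integer');
    }
    // Assumed set of similarity functions; adjust to what your database accepts.
    if (!['cosine', 'dot_product', 'euclidean'].includes(similarity)) {
        throw new Error(`Unsupported similarity function: ${similarity}`);
    }
    return {
        createTable: `CREATE TABLE IF NOT EXISTS ${keyspace}.${table} ` +
            `(id INT PRIMARY KEY, text TEXT, vector VECTOR<FLOAT,${dimension}>)`,
        createIndex: `CREATE CUSTOM INDEX IF NOT EXISTS idx_${table} ` +
            `ON ${keyspace}.${table} (vector) USING 'StorageAttachedIndex' ` +
            `WITH OPTIONS = {'similarity_function' : '${similarity}'}`,
    };
}

// Usage inside run() (not executed here):
// const schema = buildVectorSchema('default_keyspace', 'vector_test', 5);
// await client.execute(schema.createTable);
// await client.execute(schema.createIndex);
```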

Load data

Insert a few rows with embeddings into the table.

// ...

const text_blocks = [
    { id: 1, text: 'ChatGPT integrated sneakers that talk to you', vector: [0.1, 0.15, 0.3, 0.12, 0.05] },
    { id: 2, text: 'An AI quilt to help you sleep forever', vector: [0.45, 0.09, 0.01, 0.2, 0.11] },
    { id: 3, text: 'A deep learning display that controls your mood', vector: [0.1, 0.05, 0.08, 0.3, 0.6] },
];

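// The sample values are fixed and trusted, so they are interpolated directly into the CQL text.
// For untrusted input, validate and escape the values before building the statement.
for (const block of text_blocks) {
    const { id, text, vector } = block;
    await client.execute(
        `INSERT INTO ${keyspace}.vector_test (id, text, vector) VALUES (${id}, '${text}', [${vector}])`
    );
}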

// ...
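The insert loop above builds each statement by string interpolation, which breaks if a text value contains a single quote or a vector has the wrong length. The following sketch shows one way to guard against both. `buildInsert` is a hypothetical helper, not a driver API, and it assumes the vector_test table layout created earlier.

```javascript
// Hypothetical helper: validate one row and render it as a CQL INSERT statement.
function buildInsert(keyspace, block, dimension) {
    const { id, text, vector } = block;
    if (!Number.isInteger(id)) {
        throw new Error(`id must be an integer, got ${id}`);
    }
    if (!Array.isArray(vector) || vector.length !== dimension ||
        !vector.every(Number.isFinite)) {
        throw new Error(`Row ${id}: expected ${dimension} finite numbers`);
    }
    // CQL escapes a single quote inside a string literal by doubling it.
    const escapedText = String(text).replace(/'/g, "''");
    return `INSERT INTO ${keyspace}.vector_test (id, text, vector) ` +
        `VALUES (${id}, '${escapedText}', [${vector.join(', ')}])`;
}

// Usage inside run() (not executed here):
// for (const block of text_blocks) {
//     await client.execute(buildInsert(keyspace, block, v_dimension));
// }
```

Checking the vector length on the client gives a clearer error message than waiting for the database to reject a vector that does not match the column's declared dimension.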

Perform a similarity search

Find the rows whose vectors are closest to a specific vector embedding.

// ...

  const ann_query = `
    SELECT id, text, similarity_cosine(vector, [0.15, 0.1, 0.1, 0.35, 0.55]) as sim
    FROM ${keyspace}.vector_test
    ORDER BY vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55] LIMIT 2
  `;

  const result = await client.execute(ann_query);
  result.rows.forEach(row => {
    console.log(`[${row.id}] "${row.text}" (sim: ${row.sim.toFixed(4)})`);
  });

  await client.shutdown();
}

run().catch(console.error);
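To sanity-check which rows the search should return, you can rank the three sample vectors locally. The sketch below computes the raw cosine of the angle between vectors with an exact brute-force scan. `cosineSimilarity` and `nearest` are hypothetical helpers, and the score the database reports from similarity_cosine may be scaled differently from the raw cosine, so compare the ordering rather than the exact values.

```javascript
// Raw cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error('Vectors must have the same length');
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact (brute-force) nearest neighbors, most similar first.
function nearest(query, rows, limit) {
    return rows
        .map((row) => ({ id: row.id, sim: cosineSimilarity(query, row.vector) }))
        .sort((x, y) => y.sim - x.sim)
        .slice(0, limit);
}

// The same sample vectors and query vector used in this quickstart.
const sampleRows = [
    { id: 1, vector: [0.1, 0.15, 0.3, 0.12, 0.05] },
    { id: 2, vector: [0.45, 0.09, 0.01, 0.2, 0.11] },
    { id: 3, vector: [0.1, 0.05, 0.08, 0.3, 0.6] },
];
const top2 = nearest([0.15, 0.1, 0.1, 0.35, 0.55], sampleRows, 2);
console.log(top2.map((r) => r.id)); // [ 3, 2 ]
```

An ANN index trades exactness for speed on large tables, so on bigger datasets its results can differ from an exact scan like this one.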

The Node.js driver is now connected to your database, you have loaded a set of vector embeddings, and you have run a similarity search that returns the vectors closest to the one in your query.

Resources

See the Node.js driver documentation for details about statements, connection pooling, node discovery, load balancing, retry policies, and other topics.

