Data API introduction
The Data API is the foundational vector API for Astra DB Serverless databases.
Overview
The Data API lets you create AI applications that interact with Astra DB Serverless databases. Its commands include vector searches with AI projections that return similarity scores. The Data API also provides a range of query and update operators for filtering documents, modifying them, and sorting response data.
The clients for Python, TypeScript, and Java are custom abstractions based on the underlying functionality provided by the Data API.
In addition to using those language-specific clients, you can submit Data API commands directly via any of the following methods:
- curl commands to send Data API requests to Astra DB Serverless databases, as detailed in Data API commands.
- The Data API Swagger UI, which includes "Try It Out" functionality. In a browser, open the Swagger UI by specifying your database’s API Endpoint, using this format:
  <ASTRA_DB_API_ENDPOINT>/api/json/swagger-ui/
When you create your Serverless (Vector) database in Astra Portal, the API Endpoint value is shown in the Database Details section. If you add more regions in the database, all API endpoints appear in the API Endpoints dialog.
For more detail beyond the curl, Postman, and Swagger UI examples of the Data API, see the API reference.
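As a rough sketch of the direct approach, the snippet below derives the Swagger UI address from an API Endpoint and builds (without sending) an HTTP request that carries one Data API command. The `/api/json/v1/<keyspace>/<collection>` path, the `Token` header name, and the `find` command body are assumptions that this section does not state; confirm them in the API reference before relying on them. The endpoint and token values are placeholders.

```python
import json
import urllib.request


def swagger_ui_url(api_endpoint: str) -> str:
    """Return the Swagger UI address for a database API Endpoint."""
    return api_endpoint.rstrip("/") + "/api/json/swagger-ui/"


def build_command_request(api_endpoint, token, keyspace, collection, command):
    """Build a POST request carrying one Data API command as JSON.

    Assumed (not stated on this page): the /api/json/v1/<keyspace>/<collection>
    path and the "Token" header name.
    """
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}/{collection}"
    return urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder endpoint and token, for illustration only.
    endpoint = "https://example-endpoint.invalid"
    request = build_command_request(
        endpoint, "example-token", "default_keyspace", "my_collection",
        {"find": {"filter": {}}},
    )
    print(swagger_ui_url(endpoint))
    print(request.get_method(), request.full_url)
    # To actually send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the URL and headers easy to inspect before any network call is made.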
Prerequisites
The Data API examples assume the following:
- You have an active Astra account.
- You have created an Astra DB Serverless database in Astra Portal.
- You have generated an application token in Astra Portal.
- For your database, you copied the API Endpoint and auth token values from the Database Details section of Astra Portal, and you exported the values to environment variables in a CLI of your choosing:
  - ASTRA_DB_API_ENDPOINT
  - ASTRA_DB_APPLICATION_TOKEN
- You can define a database namespace, which is also known as a keyspace, in the ASTRA_DB_KEYSPACE environment variable. Recommendation: specify the name that’s already set for every Astra DB Serverless database: default_keyspace.

To use the examples, also define an ASTRA_DB_COLLECTION environment variable.
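The prerequisites above can be checked from code before any request is made. The following is a minimal sketch, not part of any official client: the `DataApiSettings` class and `load_settings` function are hypothetical names, and treating the keyspace as optional with a `default_keyspace` fallback is an interpretation of the recommendation above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DataApiSettings:
    """Hypothetical container for the values the examples expect."""
    api_endpoint: str
    application_token: str
    keyspace: str
    collection: str


def load_settings(env=None) -> DataApiSettings:
    """Read the prerequisite variables, failing early if one is missing."""
    env = os.environ if env is None else env
    required = (
        "ASTRA_DB_API_ENDPOINT",
        "ASTRA_DB_APPLICATION_TOKEN",
        "ASTRA_DB_COLLECTION",
    )
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return DataApiSettings(
        api_endpoint=env["ASTRA_DB_API_ENDPOINT"],
        application_token=env["ASTRA_DB_APPLICATION_TOKEN"],
        # Assumption: the keyspace is optional and falls back to the default name.
        keyspace=env.get("ASTRA_DB_KEYSPACE") or "default_keyspace",
        collection=env["ASTRA_DB_COLLECTION"],
    )
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.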
Data API naming conventions
Collection and property names must start and end with a letter or an underscore, and may only contain the following characters:
- a-z
- A-Z
- 0-9
- _ (underscore)
- - (hyphen)
Names must be between 1 and 48 characters.
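These rules can be expressed as a single regular expression. The sketch below is a client-side convenience check only; `is_valid_name` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import re

# First and last characters: a letter or an underscore.
# Middle characters: letters, digits, underscores, or hyphens.
# Total length: 1 to 48 characters (1 + up to 46 + 1).
_NAME_PATTERN = re.compile(r"[A-Za-z_](?:[A-Za-z0-9_-]{0,46}[A-Za-z_])?")


def is_valid_name(name: str) -> bool:
    """Return True if the name follows the naming conventions listed above."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my_collection", "a-b", "9lives", "name-", "a" * 49):
        print(repr(candidate), is_valid_name(candidate))
```

Note that under the rule as stated, a name ending in a digit is rejected because the last character must be a letter or an underscore.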
The _id property is reserved and interpreted as a document’s identity property.
The dollar sign $ is reserved for system-defined operator and property names. For example, $exists, $and, $or, and $vector.
Data API data types
Supported data types in Data API:
- String
- Number
- Object (JSON object)
- Array
- Boolean
- Vector (via $vector)
- Date (via $date)
- Null
- UUID (via $uuid)
- ObjectId (via $objectId)
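To make the list concrete, the sketch below assembles one document that uses each supported type and serializes it to JSON. The property names are invented for illustration. The exact encodings are assumptions not spelled out in this section: `$date` as milliseconds since the Unix epoch, `$uuid` as a UUID string, and `$objectId` as a 24-character hexadecimal string.

```python
import json
from datetime import datetime, timezone


def to_date(value: datetime) -> dict:
    """Wrap a datetime as {"$date": <epoch milliseconds>} (assumed encoding)."""
    return {"$date": int(value.timestamp() * 1000)}


def build_document() -> dict:
    """Return an example document covering every supported data type."""
    return {
        "_id": "doc-1",                      # String (reserved identity property)
        "price": 12.5,                       # Number
        "metadata": {"pages": 320},          # Object
        "tags": ["fiction", "classic"],      # Array
        "in_stock": True,                    # Boolean
        "$vector": [0.1, 0.2, 0.3],          # Vector
        "published": to_date(datetime(2024, 1, 1, tzinfo=timezone.utc)),  # Date
        "discontinued_on": None,             # Null
        # Assumed wrapper formats for UUID and ObjectId values:
        "edition_id": {"$uuid": "123e4567-e89b-12d3-a456-426614174000"},
        "legacy_id": {"$objectId": "6672e1cbd7fabb4e5493916f"},
    }


if __name__ == "__main__":
    print(json.dumps(build_document(), indent=2))
```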
Data API limits
The Data API includes guardrails to ensure best practices, foster availability, and promote optimal configurations for your Astra DB Serverless databases.
Entity | Limit | Notes |
---|---|---|
Number of collections per database | Five | Up to five collections in a Serverless (Vector) database. |
Page size | 20 | A page may contain up to 20 documents. After that per-page maximum is reached, you can load any additional documents on the next page via the |
Sort page size | 100 | Document page size for sorting; implemented as separate from page size because sort operations need more rows per page. |
Maximum property name | 100 | Maximum of 100 characters in a property name. |
Maximum path length | 1,000 | Maximum of 1,000 characters in a path name; total for all segments, including any dots (.) between properties in a path. |
String property maximum bytes | 8,000 | Maximum of 8,000 UTF-8 bytes for |
Number property maximum characters | 100 | Maximum of 100 characters for |
Maximum elements per array | 1,000 | Maximum number of elements in an array. This limit applies to indexed properties only. This limit is ignored for non-indexed properties. |
Maximum dimensions in vector-enabled collection | 4,096 | Maximum size of dimensions you can define for a vector-enabled collection. |
Maximum number of properties per JSON object | 1,000 | Maximum number of properties for a JSON object. This limit applies to indexed properties only. This limit is ignored for non-indexed properties. A given JSON object may have nested objects, also known as sub-documents. This maximum total count of 1,000 refers to all the indexed properties in the main document, plus a count of 1 for each sub-document (if any). |
Maximum number of properties per JSON document | 2,000 | Maximum number of properties allowed in a single JSON document is 2,000. This limit includes intermediate properties as well as leaf properties. For example, given this document: For the purposes of the limit, the document has three properties: |
Maximum document size in characters | 4 million | Maximum size of each document in a collection is 4 million characters. |
Maximum inserted batch size in characters | 20 million | Maximum size of an entire batch of documents submitted via an |
Maximum number of documents deleted per transaction | 20 | Maximum number of documents that can be deleted in each transaction. |
Maximum number of documents updated per transaction | 20 | Maximum number of documents that can be updated in each transaction. |
Maximum number of documents inserted per transaction | 20 | Maximum number of documents that can be inserted in each transaction when using |
Maximum size | 100 | Maximum size of an |
Maximum number of documents returned with each vector search | 1,000 | Maximum number of documents returned with each vector search. |
If your code exceeds a limit, Data API still responds with an
The SUCCESS response would contain a message such as:
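Client code can stay within some of these limits before sending anything. As a minimal sketch, the helpers below split a list of documents into batches of at most 20 (the per-transaction insert limit) and flag top-level string properties larger than 8,000 UTF-8 bytes. The function names are hypothetical, and the check deliberately ignores nested properties and the other limits in the table.

```python
from typing import Iterator, List, Sequence

MAX_DOCS_PER_INSERT = 20   # documents inserted per transaction
MAX_STRING_BYTES = 8_000   # UTF-8 bytes per string property


def batches(documents: Sequence[dict], size: int = MAX_DOCS_PER_INSERT) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield list(documents[start:start + size])


def oversized_strings(document: dict, limit: int = MAX_STRING_BYTES) -> List[str]:
    """Return the names of top-level string properties that exceed the byte limit."""
    return [
        name
        for name, value in document.items()
        if isinstance(value, str) and len(value.encode("utf-8")) > limit
    ]


if __name__ == "__main__":
    docs = [{"_id": str(i), "body": "x" * 10} for i in range(45)]
    print([len(batch) for batch in batches(docs)])
    print(oversized_strings({"body": "x" * 8001, "title": "ok"}))
```

The byte check encodes each string first because the limit counts UTF-8 bytes, not characters, so text with non-ASCII characters reaches it sooner.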
Data API operators
The Data API provides a range of query operators that you can use in filters, and update operators that you can use to modify matching documents.
For examples in Data API request payloads, see Data API commands. Also see the Data API vector collection in Postman.
Operator type | Name | Purpose |
---|---|---|
Logical query | | Joins query clauses with a logical |
Logical query | | Joins query clauses with a logical |
Logical query | | Returns documents that do not match the conditions of the filter clause. |
Range query | | Matches documents where the given property is greater than the specified value. |
Range query | | Matches documents where the given property is greater than or equal to the specified value. |
Range query | | Matches documents where the given property is less than the specified value. |
Range query | | Matches documents where the given property is less than or equal to the specified value. |
Comparison query | | Matches documents where the value of a property equals the specified value. This is the default when you do not specify an operator. |
Comparison query | | Matches documents where the value of a property does not equal the specified value. |
Comparison query | | Matches any of the values specified in the array. |
Comparison query | | Matches any of the values that are NOT IN the array. |
Element query | | Matches documents that have the specified property. |
Array query | | Matches arrays that contain all elements in the specified array. |
Array query | | Selects documents where the array has the specified number of elements. |
Property update | | Used in an update operation. In the following example, the |
Property update | | Increments the value of the property by the specified amount. |
Property update | | Updates the property only if the specified value is less than the existing property value. |
Property update | | Updates the property only if the specified value is greater than the existing property value. |
Property update | | Multiplies the value of a property in the document. Example: |
Property update | | Renames the specified property in each matching document. |
Property update | | Sets the value of a property in each matching document. |
Property update | | Sets the value of a property in the document if an upsert is performed. Example: |
Property update | | Removes the specified property from each matching document. |
Array update | | Adds elements to the array only if they do not already exist in the set. |
Array update | | Removes the first or last item of the array, depending on the value of the operator ( |
Array update | | Adds or appends data to the end of the property value. If the value is not yet an array: if the property has no value, creates a one-element array (containing the item given); if the property has a non-array value, creates a two-element array, with the old value as the first entry and the specified item as the second entry. |
Array update | | An array update that modifies the |
Array update | | An array update that modifies the |
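As an illustration of how query and update operators combine in a request payload, the sketch below builds two command bodies as Python dictionaries. The operator names used here ($and, $exists, $gt, $inc, $set) follow the $-prefixed convention described earlier, but $gt, $inc, and $set, the `find` and `updateOne` command names, and the `price`, `stock`, and `status` properties are assumptions for illustration; check Data API commands for the exact names and payload shapes.

```python
import json


def find_command(min_price: float, required_property: str) -> dict:
    """Filter: price greater than min_price AND the given property exists."""
    return {
        "find": {
            "filter": {
                "$and": [
                    {"price": {"$gt": min_price}},           # range query (assumed name)
                    {required_property: {"$exists": True}},  # element query
                ]
            }
        }
    }


def update_command(doc_id: str, amount: int) -> dict:
    """Update one document: increment a counter and set a status property."""
    return {
        "updateOne": {
            # No operator in the filter means an equality match.
            "filter": {"_id": doc_id},
            "update": {
                "$inc": {"stock": amount},          # property update (assumed name)
                "$set": {"status": "restocked"},    # property update (assumed name)
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(find_command(10, "isbn"), indent=2))
    print(json.dumps(update_command("doc-1", 5), indent=2))
```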
What’s next?
See the next topic for details about the Data API commands and submitting them via curl.