Introduction to Zero Downtime Migration

Enterprises today depend on the ability to reliably migrate mission-critical client applications and data to cloud environments with zero downtime during the migration.

At DataStax, we’ve developed a set of thoroughly tested self-service tools, automation scripts, examples, and documented procedures that walk you through well-defined migration phases.

We call this product suite DataStax Zero Downtime Migration (ZDM).

ZDM provides a simple and reliable way for you to migrate applications from any CQL-based cluster (Apache Cassandra®, DataStax Enterprise (DSE), Astra DB, or any other CQL-based database) to any other CQL-based cluster, without any interruption of service to the client applications or their data.

  • You can move your application to Astra DB, DSE, or Cassandra with no downtime and with minimal configuration changes.

  • Your clusters will be kept in sync at all times by a dual-write logic configuration.

  • You can roll back at any point, for complete peace of mind.

This suite of tools allows for zero downtime migration only if your database meets the minimum requirements. If your database does not meet these requirements, you can complete the migration from Origin to Target, but downtime might be necessary to finish the migration.

The Zero Downtime Migration process requires you to be able to perform rolling restarts of your client applications during the migration.

This is standard practice for client applications that are deployed over multiple instances and is a widely used approach to roll out releases and configuration changes.

Supported releases

Overall, you can use ZDM Proxy to migrate:

  • From: Any Cassandra 2.1.6 or higher release, or from any DSE 4.7.1 or higher release

  • To: Any higher release of Cassandra, or to any higher release of DSE, or to Astra DB
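
If you’re not sure whether your Origin cluster meets these minimums, you can check its release version directly. Below is a minimal sketch using the DataStax Python driver; the contact point, port, and credentials are placeholders for your own environment.

    # Check the release version of the Origin cluster (placeholders throughout).
    from cassandra.cluster import Cluster
    from cassandra.auth import PlainTextAuthProvider

    auth = PlainTextAuthProvider(username="cassandra", password="cassandra")
    cluster = Cluster(["origin-contact-point"], port=9042, auth_provider=auth)
    session = cluster.connect()

    row = session.execute("SELECT release_version FROM system.local").one()
    print(f"Origin release version: {row.release_version}")

    cluster.shutdown()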

Migration scenarios

Here are just a few examples of migration scenarios that are supported when moving from one type of CQL-based database to another:

  • From an existing self-managed Cassandra or DSE cluster to cloud-native Astra DB. For example:

    • Cassandra 2.1.6+, 3.11.x, 4.0.x, or 4.1.x to Astra DB

    • DSE 4.7.1+, 4.8.x, 5.1.x, or 6.8.x to Astra DB

  • From an existing Cassandra or DSE cluster to another Cassandra or DSE cluster. For example:

    • Cassandra 2.1.6+ or 3.11.x to Cassandra 4.0.x or 4.1.x

    • DSE 4.7.1+, 4.8.x, or 5.1.x to DSE 6.8.x

    • Cassandra 2.1.6+, 3.11.x, 4.0.x, or 4.1.x to DSE 6.8.x

    • DSE 4.7.1+ or 4.8.x to Cassandra 4.0.x or 4.1.x

  • From Astra DB Classic to Astra DB Serverless

Migration phases

First, a couple of key terms used throughout the Zero Downtime Migration documentation and software components:

  • Origin: This cluster is your existing Cassandra-based environment, whether it’s open-source Apache Cassandra, DSE, or Astra DB Classic.

  • Target: This cluster is the new environment to which you want to migrate client applications and data with zero downtime.

For additional terms, see the glossary.

Your migration project occurs through a sequence of phases, with zero downtime.

Before your migration begins, you’ll need to satisfy prerequisites, prepare your environment, and set up the recommended infrastructure. Then, start in Phase 1 and progress through each phase in sequence:

Migration phases from start to finish

  • Phase 1: Deploy the ZDM Proxy and connect your client applications. This activates the dual-write logic: writes will be "bifurcated" (sent both to Origin and Target), while reads will be executed on Origin only.

  • Phase 2: Migrate existing data using Cassandra Data Migrator and/or DSBulk Migrator. Validate that the migrated data is correct, while continuing to perform dual writes.

  • Phase 3: Enable asynchronous dual reads (optional).

  • Phase 4: Change the proxy configuration to route reads to Target, effectively using Target as the source of truth while still keeping Origin in sync.

  • Phase 5: Move your client applications off the ZDM Proxy and connect them directly to Target.

Migration workflow

Here’s a diagram to illustrate the overall migration strategy when moving client applications and data from Origin to Target.

Migration workflow from client application to ZDM Proxy with dual writes to Origin and Target

  • With no changes required to your client application code itself, ZDM Proxy does the work to route writes to Origin and Target.

  • Cassandra Data Migrator and/or DSBulk Migrator can migrate data between clusters of any supported types. These tools are introduced in DataStax Zero Downtime Migration components below.

  • Initially during the migration, ZDM Proxy always reads from Origin.

  • Once all the data has been imported into Target, you can run any validation and/or reconciliation on it. You can also optionally enable asynchronous reads to be sent to Target to try out the performance and validate that it can handle your application’s live request load before cutting over.

  • At this point, the read routing on the ZDM Proxy is switched to Target so that all reads are executed on it, while writes are still sent to both clusters. In other words, Target becomes the primary cluster.

  • Finally, the client application can be moved off the proxy and connected directly to Target, at which point the migration is complete.

DataStax Zero Downtime Migration components

  • The main component of the DataStax Zero Downtime Migration product suite is ZDM Proxy, which by design is a simple and lightweight proxy that handles all the real-time requests generated by your client applications. ZDM Proxy is open-source software (OSS), available in its public GitHub repo, https://github.com/datastax/zdm-proxy. You can view the source files and contribute code for potential inclusion via Pull Requests (PRs) initiated on a fork of the repo.

The ZDM Proxy itself has no capability to migrate data and no knowledge that a migration may be ongoing; it is not coupled to the migration process in any way.

  • DataStax Zero Downtime Migration also provides the ZDM Utility and ZDM Automation to set up and run the Ansible playbooks that deploy and manage the ZDM Proxy and its monitoring stack.

  • Two data migration tools are available to move your existing data: Cassandra Data Migrator and DSBulk Migrator. See the summary of features below.

Role of ZDM Proxy

We created ZDM Proxy to function between the application and both databases (Origin and Target). The databases can be any CQL-compatible data store (for example, Apache Cassandra, DataStax Enterprise, or Astra DB). The proxy always sends every write operation (INSERT, UPDATE, DELETE) synchronously to both clusters at the desired Consistency Level:

  • If the write is successful in both clusters, it returns a successful acknowledgement to the client application.

  • If the write fails on either cluster, the failure is passed back to the client application so that it can retry it as appropriate, based on its own retry policy.

This design ensures that new data is always written to both clusters, and that any failure on either cluster is always made visible to the client application. ZDM Proxy also sends all reads to the primary cluster (initially Origin, and later Target) and returns the result to the client application.
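
The following is a minimal conceptual sketch of that dual-write and primary-read behavior, written in Python purely for illustration; it is not the ZDM Proxy implementation, and the function and session names are hypothetical.

    # Conceptual sketch only (hypothetical names); not the actual ZDM Proxy code.
    from concurrent.futures import ThreadPoolExecutor

    def handle_write(statement, origin_session, target_session):
        """Send a write to both clusters; succeed only if both writes succeed."""
        with ThreadPoolExecutor(max_workers=2) as pool:
            origin_result = pool.submit(origin_session.execute, statement)
            target_result = pool.submit(target_session.execute, statement)
            # If either write fails, the exception is raised here, and the
            # failure is passed back to the client application, which can
            # retry based on its own retry policy.
            origin_result.result()
            target_result.result()
        return "acknowledged"  # both clusters accepted the write

    def handle_read(statement, primary_session):
        """Reads are executed on the primary cluster only (Origin at first, later Target)."""
        return primary_session.execute(statement)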

ZDM Proxy is designed to be highly available. It can be scaled horizontally, so typical deployments are made up of a minimum of 3 servers. ZDM Proxy can be restarted in a rolling fashion, for example, to change configuration for different phases of the migration.

ZDM Proxy has been designed to run in a clustered fashion so that it is never a single point of failure. Unless it is for a demo or local testing environment, a ZDM Proxy deployment should always comprise multiple ZDM Proxy instances.

We will often use the term ZDM Proxy to indicate the whole deployment, and ZDM Proxy instance to refer to an individual proxy process in the deployment.

Key features of ZDM Proxy

  • Allows you to lift-and-shift existing application code from Origin to Target with a simple change of a connection string (see the connection sketch at the end of this section).

  • Reduces the risks of upgrades and migrations by decoupling Origin from Target and providing an explicit cut-over point once you’re satisfied with Target.

  • Bifurcates writes synchronously to both clusters during the migration process.

  • Returns, for read operations, the response from the primary cluster, which is the designated source of truth. During a migration, Origin is typically the primary cluster. Near the end of the migration, you’ll shift the primary cluster to be Target.

  • Can be configured to also read asynchronously from Target. This capability is called Asynchronous Dual Reads (also known as Read Mirroring) and allows you to observe what read latencies and throughput Target can achieve under the actual production load.

    • Results from the asynchronous reads executed on Target are not sent back to the client application.

    • This design implies that a failure on an asynchronous read from Target does not cause an error in the client application.

    • Asynchronous dual reads can be enabled and disabled dynamically with a rolling restart of the ZDM Proxy instances.

When using Asynchronous Dual Reads, any additional read load on Target may impact its ability to keep up with writes. This behavior is expected and desired. The idea is to mimic the full read and write load on Target so there are no surprises during the last migration phase; that is, after cutting over completely to Target.
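
To illustrate the connection change mentioned in the first bullet above, the sketch below points a client application at the ZDM Proxy instances instead of Origin. It uses the DataStax Python driver; the addresses, credentials, and keyspace name are placeholders for your own deployment.

    # The client application code is unchanged; only the contact points (and,
    # if needed, credentials) move from Origin to the ZDM Proxy instances.
    # All names and addresses below are placeholders.
    from cassandra.cluster import Cluster
    from cassandra.auth import PlainTextAuthProvider

    auth = PlainTextAuthProvider(username="app_user", password="app_password")

    # Before the migration, the application connected directly to Origin:
    # cluster = Cluster(["origin-node-1", "origin-node-2"], auth_provider=auth)

    # During the migration, it connects to the ZDM Proxy deployment instead:
    cluster = Cluster(["zdm-proxy-1", "zdm-proxy-2", "zdm-proxy-3"],
                      port=9042, auth_provider=auth)

    session = cluster.connect("my_keyspace")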

ZDM Utility and ZDM Automation

Ansible is a suite of software tools that enables infrastructure as code. It is open source and its capabilities include software provisioning, configuration management, and application deployment functionality.

The Ansible automation for ZDM is organized into playbooks, each implementing a specific operation. The machine from which the playbooks are run is known as the Ansible Control Host. In ZDM, the Ansible Control Host will run as a Docker container.

You will use the ZDM Utility to set up Ansible in a Docker container, and ZDM Automation to run the Ansible playbooks from the Docker container created by ZDM Utility. In other words, the ZDM Utility creates the Docker container acting as the Ansible Control Host, from which the ZDM Automation allows you to deploy and manage the ZDM Proxy instances and the associated monitoring stack (Prometheus metrics and Grafana visualizations of the metric data).

ZDM Utility and ZDM Automation expect that you have already provisioned the recommended infrastructure, as outlined in Deployment and infrastructure considerations.

The source for both of these tools is in a public repo.

For details, see:

  • Set up the ZDM Automation with ZDM Utility

  • Deploy the ZDM Proxy and monitoring

Data migration tools

As part of the overall migration process, you can use Cassandra Data Migrator and/or DSBulk Migrator to migrate your data.

Cassandra Data Migrator

Use Cassandra Data Migrator to:

  • Migrate your data from any CQL-supported Origin to any CQL-supported Target. Examples of databases that support CQL are Apache Cassandra, DataStax Enterprise and Astra DB.

  • Validate migration accuracy and performance using examples that provide a smaller, randomized data set

  • Preserve internal writetime timestamps and Time To Live (TTL) values (illustrated in the sketch after this list)

  • Take advantage of advanced data types (Sets, Lists, Maps, UDTs)

  • Filter records from the Origin data, using Cassandra’s internal writetime timestamp

  • Use SSL Support, including custom cipher algorithms
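
As an illustration of the writetime and TTL preservation mentioned above, the following sketch shows the underlying CQL pattern: reading a cell’s write timestamp and TTL from Origin and replaying them on Target with USING TIMESTAMP and TTL. This is a conceptual example with hypothetical keyspace, table, and column names; it is not how Cassandra Data Migrator itself is invoked.

    # Conceptual sketch of preserving writetime and TTL (hypothetical names);
    # assumes origin_session and target_session are connected driver sessions.
    rows = origin_session.execute(
        "SELECT pk, val, WRITETIME(val) AS wt, TTL(val) AS ttl FROM ks.tbl"
    )

    insert_with_ttl = target_session.prepare(
        "INSERT INTO ks.tbl (pk, val) VALUES (?, ?) USING TIMESTAMP ? AND TTL ?"
    )
    insert_without_ttl = target_session.prepare(
        "INSERT INTO ks.tbl (pk, val) VALUES (?, ?) USING TIMESTAMP ?"
    )

    for row in rows:
        if row.ttl is not None:
            target_session.execute(insert_with_ttl, (row.pk, row.val, row.wt, row.ttl))
        else:
            target_session.execute(insert_without_ttl, (row.pk, row.val, row.wt))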

Cassandra Data Migrator is designed to:

  • Connect to and compare your Target database with Origin

  • Report differences in a detailed log file

  • Optionally reconcile any missing records and fix any data inconsistencies in Target, if you enable autocorrect in a config file

An important prerequisite is that you already have the matching schema on Target.
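
For example, if Origin contains a table like the hypothetical one below, the same keyspace and table definition must already exist on Target before you start migrating data. This is a sketch using the Python driver against Target; the keyspace, table, and replication settings are placeholders, and on Astra DB keyspaces are created in the Astra Portal rather than with CQL.

    # Create a matching (hypothetical) schema on Target before migrating data.
    # Assumes target_session is connected to the Target cluster.
    target_session.execute("""
        CREATE KEYSPACE IF NOT EXISTS ks
        WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
    """)
    target_session.execute("""
        CREATE TABLE IF NOT EXISTS ks.tbl (
            pk  text,
            ck  timestamp,
            val text,
            PRIMARY KEY (pk, ck)
        )
    """)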

DSBulk Migrator

You can also take advantage of DSBulk Migrator to migrate smaller sets of data.

For more about both tools, see Migrate and validate data.

A fun way to learn: Zero Downtime Migration Interactive Lab

We’ve built a complementary learning resource to accompany this ZDM documentation: the Zero Downtime Migration Interactive Lab, available here:

https://www.datastax.com/dev/zdm

  • All you need is a browser and a GitHub account.

  • There’s nothing to install for the lab, which opens in a pre-configured Gitpod environment.

  • You’ll learn about a full migration without leaving your browser!

The lab supports all major browsers except Safari. For more, see the lab’s start page.

We encourage you to explore this free hands-on interactive lab from DataStax Academy. It’s an excellent, detailed view of the migration process. The lab describes and demonstrates all the steps and automation performed to prepare for, and complete, a migration from any Cassandra, DSE, or Astra DB database to another.

The interactive lab spans the pre-migration prerequisites and each of the five key migration phases identified above.
