Getting started with DataStax Enterprise 6.8
Information about using DataStax Enterprise for Administrators.
This topic provides basic information and a roadmap to documentation for System Administrators new to DataStax Enterprise.
Which product?
To help you choose which DataStax products best fit your requirements, see Products on the DataStax website. DataStax Enterprise (DSE) provides all the capabilities of Apache Cassandra® plus advanced functionality (detailed below).
Learn
Before diving into administration tasks, learn a few basics. Doing so can save you a lot of time when setting up and operating DataStax Enterprise (DSE) in a production environment:
- Differences between Cassandra/DSE and relational databases
-
Cassandra and DSE databases are very different from relational databases: their data model is based on the queries you need to run, not on modeling entities and relationships. DataStax highly recommends taking 7 minutes to read Architecture in brief. It contains key concepts and terminology for understanding the database.
- DSE OpsCenter and Lifecycle Manager
-
DSE OpsCenter and Lifecycle Manager automate and simplify many administrative tasks.
- Learning resources
-
-
DataStax sample code and examples - Help for getting things done faster.
-
Katacoda scenarios - Distributed throughout the DataStax docs to help you learn how to use Cassandra and DataStax products in real environments.
-
Learn menu - Available on every page, where you can quickly access other resources such as blogs and DataStax Academy.
-
Save yourself some time and frustration by spending a few moments looking at DataStax Doc and Search tips. These short topics cover navigation and bookmarking aids that make your journey through the docs more efficient and productive.
The following are not administrator specific but are presented to give you a fuller picture of the database:
-
Cassandra Query Language (CQL) is the query language for DataStax Enterprise.
-
DataStax provides drivers in several programming languages for connecting client applications to the database.
-
APIs are available to interface with OpsCenter, DseGraphFrame, DataStax Spark Cassandra Connector, and the drivers.
Plan
The Planning and testing guide contains guidelines for capacity planning and hardware selection in production environments. Key topics include:
Install
DataStax offers a variety of ways to set up a cluster:
- Cloud
- On premises
-
-
Deployment per workload type
For help with choosing an install type, see Which install method should I use?
Secure
DSE Advanced Security provides fine-grained user and access controls to keep application data protected and compliant with regulatory standards like PCI, SOX, HIPAA, and the European Union’s General Data Protection Regulation (GDPR). Key topics include:
The DSE database includes the default role cassandra with password cassandra. This superuser login has full access to the database. DataStax recommends using the cassandra role only once, during initial Role Based Access Control (RBAC) setup, to establish your own root account, and then disabling the cassandra role. See Adding a superuser login.
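As a rough illustration of that recommendation, the sketch below builds the two CQL statements involved: one that creates a replacement superuser role and one that disables the default cassandra role. The role name root_admin, the sample password, and the helper function names are hypothetical, and the code only prints the statements rather than connecting to a cluster. It assumes you run the first statement while logged in as cassandra and the second while logged in as the new role, since a role cannot revoke its own superuser status.

```python
import re
from typing import List


def quote_cql_string(value: str) -> str:
    """Return a CQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def bootstrap_superuser_statements(new_role: str, password: str) -> List[str]:
    """Build CQL to create a new superuser and disable the default role.

    new_role is a hypothetical role name chosen by the administrator.
    """
    # Accept only simple unquoted identifiers to keep the sketch safe.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", new_role):
        raise ValueError("role name must be a simple identifier")

    create_role = (
        f"CREATE ROLE {new_role} "
        f"WITH SUPERUSER = true AND LOGIN = true "
        f"AND PASSWORD = {quote_cql_string(password)};"
    )
    # Run this one after logging in as the new superuser role.
    disable_default = "ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;"
    return [create_role, disable_default]


if __name__ == "__main__":
    for statement in bootstrap_superuser_statements("root_admin", "change-me"):
        print(statement)
```

The printed statements can be pasted into cqlsh. Treat Adding a superuser login as the authoritative procedure.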
Tune
Important topics for optimizing the performance of the database include:
-
Enable the NodeSync service (continuous background repair)
-
Load test your cluster before deployment
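NodeSync is enabled per table through a table option. As a minimal sketch, the helper below generates the corresponding ALTER TABLE statement for each table you list. The keyspace and table names (cycling, comments, cyclist_name) and the function name are placeholders for illustration, and the code only prints CQL rather than executing it.

```python
import re
from typing import Iterable, List

_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def nodesync_statements(keyspace: str, tables: Iterable[str], enabled: bool = True) -> List[str]:
    """Build one ALTER TABLE statement per table to toggle NodeSync."""
    flag = "true" if enabled else "false"
    statements = []
    for table in tables:
        # Accept only simple unquoted identifiers in this sketch.
        for name in (keyspace, table):
            if not _IDENTIFIER.fullmatch(name):
                raise ValueError(f"not a simple identifier: {name!r}")
        statements.append(
            f"ALTER TABLE {keyspace}.{table} WITH nodesync = {{'enabled': '{flag}'}};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical keyspace and table names.
    for statement in nodesync_statements("cycling", ["comments", "cyclist_name"]):
        print(statement)
```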
Operations
The most commonly used operations include:
Load
The primary tools for getting data into and out of the database are:
For other methods, see Migrating data to DataStax Enterprise.
Monitor
DataStax provides the following tools to monitor clusters and view metrics:
Troubleshooting/Help
DataStax provides a wide variety of resources for troubleshooting and other types of help:
- Troubleshooting
-
-
Support Knowledge Base (troubleshooting articles)
-
Submit a support ticket (registered users)
-
- Help
Advanced Functionality
In addition to all the capabilities of Apache Cassandra, DataStax Enterprise offers the following capabilities:
- DSE Analytics
-
Built on a production-certified version of Apache Spark™, with enhanced capabilities like AlwaysOn SQL for processing streaming and historical data at cloud scale.
- DSE Graph
-
DSE Graph is optimized for storing billions of items and their relationships to enable you to identify and analyze hidden relationships between connected data and build powerful modern applications for real-time use cases: fraud detection, customer 360, social networks, IoT, and recommendation systems. The DSE Graph Quick Start is a great place to get started.
- DSE Search
-
Provides powerful search and indexing capabilities, including support for full-text, relevancy, sub-string, and fuzzy queries over large data sets, aggregation, and geospatial matchups.
- DSE OpsCenter
-
Provides visual management and monitoring for DataStax Enterprise, including automatic backups, reduced manual operations, automatic failover, patch release upgrades, and secure management of DSE clusters on-premises, in the cloud, or in hybrid environments that span multiple data centers.
- Lifecycle Manager
-
A visual provisioning and monitoring tool for DataStax Enterprise clusters. LCM allows you to define the cluster configuration including datacenter, node topology, and security. LCM monitoring helps you troubleshoot installation, configuration, and upgrade jobs.
- DSE Advanced Security
-
Provides fine-grained user and access controls to keep application data protected and compliant with regulatory standards like PCI, SOX, HIPAA, and the European Union’s General Data Protection Regulation (GDPR).
- DSE Metrics Collector
-
Aggregates DSE metrics and integrates with existing monitoring solutions to facilitate problem resolution and remediation.
- DSE Management Services
-
DSE Management Services automatically handle administration and maintenance tasks and assist with overall database cluster management.
- NodeSync service
-
Continuous background repair that virtually eliminates manual efforts to run repair operations in a DataStax cluster.
- Advanced Replication
-
Advanced Replication allows a single cluster to have a primary hub with multiple spokes, which enables configurable, bi-directional distributed data replication to and from source and destination clusters.