How can I install Apache Cassandra 4.0 on a CentOS 8 | Rocky Linux 8 machine? Apache Cassandra is a free and open-source NoSQL database management system designed to be distributed and highly available. Cassandra can handle large amounts of data across many commodity servers without any single point of failure.

This guide walks you through installing Cassandra on CentOS 8 | Rocky Linux 8. After the installation, we’ll configure and tune Cassandra to work on machines with minimal resources.

Features of Cassandra

Cassandra provides the Cassandra Query Language (CQL), an SQL-like language, to create and update database schema and access data. CQL allows users to organize data within a cluster of Cassandra nodes using:

  • Keyspace: defines how a dataset is replicated, for example in which datacenters and how many copies. Keyspaces contain tables.
  • Table: defines the typed schema for a collection of partitions. New columns can be added to Cassandra tables with zero downtime. Tables contain partitions, which contain rows, which contain columns.
  • Partition: defines the mandatory part of the primary key all rows in Cassandra must have. All performant queries supply the partition key in the query.
  • Row: contains a collection of columns identified by a unique primary key made up of the partition key and optionally additional clustering keys.
  • Column: a single datum with a type that belongs to a row.
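
To make these terms concrete, here is a minimal sketch that writes a small CQL schema to a file. The keyspace name demo, the table name sensor_readings, and its columns are hypothetical and chosen only for illustration; the replication settings assume a single-node test setup.

```shell
# Write an illustrative schema to a CQL file.
# "demo" and "sensor_readings" are hypothetical names, not part of Cassandra.
schema_file="${TMPDIR:-/tmp}/demo_schema.cql"

cat > "$schema_file" <<'EOF'
-- Keyspace: replication settings for every table stored inside it.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Table: typed schema. sensor_id is the partition key;
-- reading_time is a clustering key that orders rows within a partition.
CREATE TABLE IF NOT EXISTS demo.sensor_readings (
  sensor_id    text,
  reading_time timestamp,
  temperature  double,
  PRIMARY KEY ((sensor_id), reading_time)
);

-- Row: one reading, identified by partition key plus clustering key.
INSERT INTO demo.sensor_readings (sensor_id, reading_time, temperature)
  VALUES ('sensor-1', '2021-09-01 10:00:00+0000', 21.5);

-- The query supplies the partition key in the WHERE clause.
SELECT reading_time, temperature
  FROM demo.sensor_readings
  WHERE sensor_id = 'sensor-1';
EOF

echo "Schema written to $schema_file"
```

Once Cassandra is installed and running, a file like this can be loaded with cqlsh -f followed by the file path.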

Cassandra has support for the following client drivers:

  • Java
  • Python
  • Ruby
  • C# / .NET
  • Nodejs
  • PHP
  • C++
  • Scala
  • Clojure
  • Erlang
  • Go
  • Haskell
  • Rust
  • Perl
  • Elixir
  • Dart

Install Apache Cassandra 4.0 on CentOS 8 | Rocky Linux 8

Java is required to run Cassandra on CentOS 8 | Rocky Linux 8. As of this writing, the required version of Java is 8. If you want to use cqlsh, you need the latest version of Python 2.7.

Step 1: Install Java 8 and Python:

sudo yum -y install epel-release python2 python2-pip java-1.8.0-openjdk
sudo pip2 install cqlsh

Confirm the installation of Java and Python.

$ java -version
openjdk version "1.8.0_302"
OpenJDK Runtime Environment (build 1.8.0_302-b08)
OpenJDK 64-Bit Server VM (build 25.302-b08, mixed mode)

$ python2.7 --version
Python 2.7.16
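
If you want to script this check, the sketch below extracts the major version from a Java version string. The helper names java_version_string and java_major are hypothetical; the parsing assumes the version appears in double quotes in the java -version output, as in the sample above.

```shell
# Extract the quoted version string from `java -version` (printed on stderr).
java_version_string() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Map a version string to its major version:
# legacy "1.8.0_302" style uses the second field, "11.0.12" style the first.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  echo "Java major version: $(java_major "$(java_version_string)")"
else
  echo "java not found on PATH"
fi
```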

Step 2: Install Apache Cassandra 4.0 on CentOS 8 | Rocky Linux 8

Now that Java and Python are installed, let’s add the Cassandra repository to our CentOS / Rocky system.

sudo tee  /etc/yum.repos.d/cassandra.repo <<EOF
name=Apache Cassandra

Install Apache Cassandra with the command below.

sudo yum -y install cassandra

Create the Cassandra systemd service.

sudo tee /etc/systemd/system/cassandra.service<<EOF
Description=Apache Cassandra

ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/


Start the service and enable it to start at boot.

sudo systemctl daemon-reload
sudo systemctl start cassandra.service
sudo systemctl enable cassandra

Check service status:

$ systemctl status cassandra.service
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-03-04 22:24:31 EAT; 2s ago
 Main PID: 8758 (java)
    Tasks: 10 (limit: 26213)
   Memory: 3.9G
   CGroup: /system.slice/cassandra.service
           └─8758 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-Us>

Mar 04 22:24:31 cent8.localdomain systemd[1]: Started Apache Cassandra.

You can also verify that Cassandra is running with the command below.

$ nodetool status
Datacenter: datacenter1
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  70 KiB     256          100.0%            0daf41fa-22e5-4471-bc00-9aed6f566235  rack1
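
Cassandra can take some time after the service starts before it accepts client connections. As a rough sketch, the hypothetical helper wait_for_port below polls a TCP port until it opens or a timeout expires. It relies on bash's /dev/tcp feature, so it assumes the script runs under bash, and 9042 is Cassandra's default native transport port.

```shell
#!/usr/bin/env bash
# Poll host:port once per second until it accepts a TCP connection.
# Returns 0 when the port opens, 1 if the timeout (seconds) is reached first.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-60} waited=0
  # The subshell opens and immediately closes the connection attempt.
  while ! (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example use (not executed here):
#   wait_for_port 127.0.0.1 9042 120 && echo "Cassandra is accepting clients"
```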

To run a query against Cassandra, invoke the CQL shell with the command below.

$ cqlsh
Connected to Test Cluster at
[cqlsh 5.0.1 | Cassandra 4.0.0 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.

After installation, note the following default locations:

  • The default location of configuration files is /etc/cassandra.
  • The default locations of the log and data directories are /var/log/cassandra/ and /var/lib/cassandra.

Configuring Cassandra on CentOS 8 | Rocky Linux 8

To run Cassandra on a single node, the default configuration file at /etc/cassandra/conf/cassandra.yaml is sufficient. For a multi-node cluster, you may need to modify this file to ensure your cluster is tuned properly.

At a minimum you should consider setting the following properties:

  • cluster_name: the name of your cluster.
  • seeds: a comma-separated list of the IP addresses of your cluster seeds.
  • storage_port: you don’t necessarily need to change this, but make sure that no firewall is blocking this port.
  • listen_address: the IP address of your node. This is what allows other nodes to communicate with this node, so it is important that you change it.
  • native_transport_port: as for storage_port, make sure this port is not blocked by firewalls as clients will communicate with Cassandra on this port.
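
As a minimal sketch, the two hypothetical helpers below print the current values of these properties and the firewalld commands that would open the configured ports. The commands are printed rather than executed, so you can review them first. The sketch assumes firewalld is the active firewall and that each port is written as "key: value" on a single line.

```shell
# Print the core settings found in a cassandra.yaml file.
# Commented-out lines are skipped; the optional "- " covers the nested seeds entry.
show_core_settings() {
  scs_conf=${1:-/etc/cassandra/conf/cassandra.yaml}
  grep -E '^[[:space:]]*(-[[:space:]]*)?(cluster_name|seeds|storage_port|listen_address|native_transport_port):' "$scs_conf"
}

# Print (do not run) firewall-cmd commands for the two ports in the file.
print_firewall_commands() {
  pfc_conf=${1:-/etc/cassandra/conf/cassandra.yaml}
  for pfc_key in storage_port native_transport_port; do
    pfc_port=$(awk -v k="${pfc_key}:" '$1 == k {print $2; exit}' "$pfc_conf")
    if [ -n "$pfc_port" ]; then
      echo "sudo firewall-cmd --permanent --add-port=${pfc_port}/tcp"
    fi
  done
  echo "sudo firewall-cmd --reload"
}

# Only inspect the default file when it is actually present.
if [ -f /etc/cassandra/conf/cassandra.yaml ]; then
  show_core_settings
  print_firewall_commands
fi
```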

Changing the location of directories

The configuration yaml file controls the following data directories.

  • data_file_directories: one or more directories where data files are located.
  • commitlog_directory: the directory where commitlog files are located.
  • saved_caches_directory: the directory where saved caches are located.
  • hints_directory: the directory where hints are located.

For performance reasons, if you have multiple disks, consider putting commitlog and data files on different disks.
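
To see whether the two directories currently share a device, a quick check along these lines may help. The paths are assumptions based on a default package install under /var/lib/cassandra, and device_of is a hypothetical helper; adjust both to match your cassandra.yaml.

```shell
# Print the filesystem source (for example a partition) that holds a path.
device_of() {
  df -P "$1" | awk 'NR == 2 {print $1}'
}

# Assumed default locations; override through the environment if needed.
data_dir=${DATA_DIR:-/var/lib/cassandra/data}
commitlog_dir=${COMMITLOG_DIR:-/var/lib/cassandra/commitlog}

if [ -d "$data_dir" ] && [ -d "$commitlog_dir" ]; then
  if [ "$(device_of "$data_dir")" = "$(device_of "$commitlog_dir")" ]; then
    echo "data and commitlog are on the same device: $(device_of "$data_dir")"
  else
    echo "data and commitlog are on different devices"
  fi
else
  echo "directories not found; adjust data_dir and commitlog_dir"
fi
```

Note that two different partitions can still live on the same physical disk, so treat a "different devices" result as a hint rather than proof.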

Setting Environment variables

JVM-level settings such as heap size are set in Cassandra's environment configuration. To pass additional JVM command-line arguments, add them to the JVM_OPTS environment variable; these arguments are passed to the Cassandra service when it starts.
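
As an illustration only, appending arguments to JVM_OPTS could look like the sketch below. It assumes the lines are placed in the shell script that Cassandra sources for its environment (commonly cassandra-env.sh; verify the file on your system). The heap dump path and the processor count are example values, not recommendations.

```shell
# Append extra JVM arguments while keeping whatever is already set.
# -XX:HeapDumpPath is a standard HotSpot flag; the directory is an example value.
JVM_OPTS="${JVM_OPTS:-} -XX:HeapDumpPath=/var/lib/cassandra"

# Limit the number of processors Cassandra assumes it has (example value).
JVM_OPTS="$JVM_OPTS -Dcassandra.available_processors=2"

export JVM_OPTS
echo "JVM_OPTS=$JVM_OPTS"
```

Restart the service after changing these settings so the new arguments take effect.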

Cassandra Logging

The logger in use is logback. You can change logging properties by editing logback.xml. By default, it logs at INFO level to a file called system.log and at DEBUG level to a file called debug.log. When running in the foreground, it also logs at INFO level to the console.
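
For a quick look at problems in the log, a small filter such as the following can be used. The helper name show_problems is hypothetical, and the pattern assumes each log line starts with its level (for example "WARN " or "ERROR"), which may differ if logback.xml has been customized.

```shell
# Print WARN and ERROR lines from a Cassandra log file.
# Defaults to system.log in the default log directory.
show_problems() {
  sp_log=${1:-/var/log/cassandra/system.log}
  grep -E '^(WARN|ERROR)[[:space:]]' "$sp_log"
}

# Only scan the default log when it exists and is readable.
if [ -r /var/log/cassandra/system.log ]; then
  show_problems || echo "no WARN or ERROR lines found"
fi
```

To follow the log live instead, tail -f on the same file works as usual.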

Refer to the official guide for client configuration.
