Install and Configure Apache Cassandra on Ubuntu 16.04/18.04/AWS EC2 Instance

Introduction

Apache Cassandra is a free and open-source NoSQL database management system designed to provide scalability, high availability, and uncompromised performance while handling large amounts of data. Unlike MySQL or MSSQL, it uses a cluster model.

In this tutorial, you’ll learn how to install and use Cassandra to run a single-node cluster on Ubuntu 18.04.

Why Cassandra is Important

Cassandra is a NoSQL database manager maintained by the Apache Software Foundation. It handles stored data quickly, but its main advantage is scalability. A number of organizations use Apache Cassandra, including Apple, Netflix, and eBay.

Cassandra is well suited to building large applications that manage a lot of data. It is also fault tolerant, which means that almost no data is lost if a system fails. If you need scalability and high availability without compromising performance, Cassandra is a strong choice.

Before We Begin

  • An Ubuntu 18.04 system with root access or a non-root user with sudo privileges

Step 1: Install Java 8

Before installing Cassandra on Ubuntu, make sure Java 8 is installed, either Oracle Java Standard Edition 8 or OpenJDK 8.

Here we will install OpenJDK 8, as it is simple and easy.

First, update your package index:

sudo apt update

Install the OpenJDK package:

sudo apt install openjdk-8-jdk

Verify the Java installation by running the following command which will print the Java version:

java -version

The output should report an OpenJDK version that starts with 1.8.0.

This confirms that OpenJDK 8 is installed.
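If you want to script this check, the following is a minimal sketch rather than an official tool. The function name is_java8 is hypothetical, and it assumes the first line of the java -version banner contains version "1.8. for a Java 8 runtime.

```shell
# is_java8: succeed if a `java -version` banner reports a 1.8.x (Java 8) runtime.
# The function name is an illustrative assumption, not part of any package.
is_java8() {
    # The first banner line looks like: openjdk version "1.8.0_<update>"
    printf '%s\n' "$1" | head -n 1 | grep -Eq 'version "1\.8\.'
}

# `java -version` writes its banner to stderr, so redirect it into the capture.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1)
    if is_java8 "$banner"; then
        echo "Java 8 detected"
    else
        echo "Java is installed, but it is not version 8"
    fi
else
    echo "java not found on PATH"
fi
```

The check only looks at the version string, so it treats Oracle Java 8 and OpenJDK 8 the same way.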

If you want to know more about other Java installation and configuration options, check out our Java installation tutorials.

Step 2: Install Python 2.7, if it’s missing on your system

The cqlsh shell shipped with Apache Cassandra 3.11 requires Python 2.7 rather than Python 3. If you run Apache Cassandra in a Python 3 environment, you may have trouble launching cqlsh.

So first check the Python installation and version with this command:

python -V

On Ubuntu 18.04 LTS, the output typically reports that the python command was not found.

That means you need to install Python 2.7 yourself:

sudo apt install python

After the installation completes, re-run the python -V command. The output should now be:

ubuntu@ip-10-0-1-107:~$ python -V
Python 2.7.17
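The same check can be scripted. This is a small sketch under the assumption that the python command prints a line of the form "Python X.Y.Z"; the helper name python_major is made up for this example.

```shell
# python_major: print the major version number from a `python -V` line.
# The helper name is an illustrative assumption.
python_major() {
    printf '%s\n' "$1" | sed -n 's/^Python \([0-9][0-9]*\)\..*/\1/p'
}

if command -v python >/dev/null 2>&1; then
    # Python 2 prints its version to stderr, Python 3 to stdout; capture both.
    version_line=$(python -V 2>&1)
    if [ "$(python_major "$version_line")" = "2" ]; then
        echo "python is Python 2: $version_line"
    else
        echo "python is not Python 2: $version_line"
    fi
else
    echo "python not found; install it with: sudo apt install python"
fi
```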

Step 3: Install the latest stable release of Apache Cassandra

We’ll install the latest stable version of Cassandra using packages from the official Apache Software Foundation repositories, so start by adding the repository to make the packages available to your system.

Note: Please check the latest stable release of Apache Cassandra at https://cassandra.apache.org/download/.

At the time of writing, the stable version is 3.11.7.

You need to add the Apache Cassandra repository and pull the GPG key before installing the database.

Enter the command below to add the Cassandra repository for version 3.11 to the sources list (/etc/apt/sources.list.d/cassandra.sources.list):

echo "deb https://downloads.apache.org/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list

Note: You can also install the latest beta version, 4.0. In that case, replace 311x with 40x.

The echo command above prints the repository line that tee appended to the file.
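Because tee -a appends, running the repository command twice leaves a duplicate entry in the list file. One way to avoid that is to append only when the line is missing. The sketch below is an illustration, not part of the Cassandra packaging: add_line_once is a hypothetical helper, and the demo writes to a scratch file rather than to /etc/apt/sources.list.d/.

```shell
# add_line_once FILE LINE: append LINE to FILE only if it is not already there.
# The helper name is an illustrative assumption.
add_line_once() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the text literally (no regex).
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        echo "already present in $file"
    else
        printf '%s\n' "$line" >> "$file"
        echo "added to $file"
    fi
}

# Demo against a scratch file; writing to the real list file under
# /etc/apt/sources.list.d/ would require root privileges.
demo_list=$(mktemp)
add_line_once "$demo_list" "deb https://downloads.apache.org/cassandra/debian 311x main"
add_line_once "$demo_list" "deb https://downloads.apache.org/cassandra/debian 311x main"
rm -f "$demo_list"
```

The second call finds the line already in place and leaves the file unchanged.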

Next, add the Apache Cassandra repository keys. This example uses curl, though wget works as well:

curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -

If you entered the command and the URL correctly, the output prints OK.

Now update the package list once again:

sudo apt-get update

Note: You may encounter this error:

GPG error: http://www.apache.org 311x InRelease: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY A278B781FE4B2BDA

If so, add the public key A278B781FE4B2BDA as follows:

sudo apt-key adv --keyserver pool.sks-keyservers.net --recv-key A278B781FE4B2BDA

Then repeat sudo apt-get update. The actual key may be different; you can read it from the error message itself. For a full list of Apache contributors’ public keys, refer to https://downloads.apache.org/cassandra/KEYS.
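Since the key ID comes from the error message, it can be pulled out with a short filter instead of being copied by hand. This is a sketch that assumes the message contains the literal token NO_PUBKEY followed by a hexadecimal ID, as in the error quoted in this step; missing_keys is a hypothetical name.

```shell
# missing_keys: read apt output on stdin and print each key ID that follows NO_PUBKEY.
# The function name is an illustrative assumption.
missing_keys() {
    grep -o 'NO_PUBKEY [0-9A-Fa-f]*' | awk '{ print $2 }' | sort -u
}

# Abbreviated form of the error message quoted in this step.
sample='The following signatures could not be verified because the public key is not available: NO_PUBKEY A278B781FE4B2BDA'
printf '%s\n' "$sample" | missing_keys
```

On a real system the filter could be fed from apt directly, for example with sudo apt-get update 2>&1 | missing_keys.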

Use the newly added apt repo to install Apache Cassandra:

sudo apt-get install cassandra

Step 4: Test the installation of Apache Cassandra

Cassandra starts automatically after the installation. Check the status of the service with:

sudo service cassandra status

The output should show that the service is active.

Next, use the nodetool program to show the status of Apache Cassandra on the current node:

nodetool status

If UN (Up/Normal) appears at the start of your node’s line in the output, the cluster is working.
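For scripts or monitoring, the UN check can be automated. The sketch below assumes that each node line in the nodetool status output starts with a two-letter code (U or D for up or down, then N, L, J, or M for the state) followed by whitespace; all_nodes_un is a hypothetical helper name.

```shell
# all_nodes_un: read `nodetool status` output on stdin; succeed only if at least
# one node line is present and every node line starts with UN.
# The helper name is an illustrative assumption.
all_nodes_un() {
    awk '
        # Node lines begin with a status letter (U/D) and a state letter (N/L/J/M).
        /^[UD][NLJM][ \t]/ { total++; if ($1 == "UN") up++ }
        END { exit (total > 0 && up == total) ? 0 : 1 }
    '
}

if command -v nodetool >/dev/null 2>&1; then
    if nodetool status | all_nodes_un; then
        echo "all nodes are Up/Normal"
    else
        echo "at least one node is not Up/Normal, or no nodes were listed"
    fi
else
    echo "nodetool not found on PATH"
fi
```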

Configuration files of Apache Cassandra

  • Apache Cassandra data is stored in the /var/lib/cassandra directory.
  • Configuration files are located in /etc/cassandra.
  • Log files are written to /var/log/cassandra/ by default.
  • Java start-up options can be configured in the /etc/default/cassandra file.
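To confirm that these locations exist on your system, a short loop is enough. This is only a convenience sketch; report_paths is a hypothetical helper, and the paths are the ones listed above.

```shell
# report_paths PATH...: print "found" or "missing" for each path given.
# The helper name is an illustrative assumption.
report_paths() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "found:   $p"
        else
            echo "missing: $p"
        fi
    done
}

report_paths /var/lib/cassandra /etc/cassandra /var/log/cassandra /etc/default/cassandra
```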

By default, Cassandra is configured to listen on localhost only. If the client connecting to the database runs on the same host, you don’t need to change the default configuration file.

To interact with Cassandra through CQL (the Cassandra Query Language), you can use cqlsh, a command-line utility that ships with the Cassandra package.

cqlsh

The banner shows the Cassandra version (3.11.7 in this case) and confirms that cqlsh is connected to localhost on port 9042.

For now, just type exit and then press ENTER to quit the cqlsh shell.

Manage Cassandra service

To stop the Cassandra service, run:

sudo service cassandra stop

To start the service, run:

sudo service cassandra start

To restart the Cassandra service, run:

sudo service cassandra restart
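After a start or restart, Cassandra can take some time before it accepts client connections, so a script that calls cqlsh right away may fail. A generic retry loop is one way to handle this. The sketch below is an assumption about how you might script it, with a hypothetical helper named retry; the attempt count and delay in the commented example are arbitrary.

```shell
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY seconds between tries; fail after ATTEMPTS tries.
# The helper name is an illustrative assumption.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (not run here): try cqlsh up to 12 times, 5 seconds apart.
# retry 12 5 cqlsh -e "SELECT now() FROM system.local;"
```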

Renaming Apache Cassandra Cluster

By default, the Cassandra cluster is named “Test Cluster”. If you want to change the name, follow the steps below:

Note: We strongly advise creating a backup of your cassandra.yaml file before you edit it. To do so, use this command:

sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.backup

We used the /etc/cassandra directory as the destination for the backup, but you can choose a different path.

Use a text editor of your choice to open the cassandra.yaml file (we will be using nano):

sudo nano /etc/cassandra/cassandra.yaml

Find the line that begins with cluster_name:. The default name is Test Cluster. This is the first change you will want to make when you start working with Cassandra.

For this demo, the cluster is named ‘Tecnotes Cluster’.

If you do not want to make any other changes, save the file and exit.
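If you would rather script the edit than use an editor, sed can rewrite the line. This is a minimal sketch that assumes the file has a line starting with cluster_name: at the beginning of the line and that the new name contains no "/", "&", backslash, or single quote. The helper name set_cluster_name is hypothetical, and the demo runs on a scratch file instead of the live /etc/cassandra/cassandra.yaml.

```shell
# set_cluster_name FILE NAME: rewrite the cluster_name line in FILE.
# Assumes NAME contains no "/", "&", backslash, or single quote, because it is
# pasted into a sed replacement and a single-quoted YAML string.
# The helper name is an illustrative assumption.
set_cluster_name() {
    file=$1
    name=$2
    # -i.bak edits in place and keeps the previous version as FILE.bak.
    sed -i.bak "s/^cluster_name:.*/cluster_name: '$name'/" "$file"
}

# Demo on a scratch file rather than the live configuration.
demo_yaml=$(mktemp)
printf "cluster_name: 'Test Cluster'\nnum_tokens: 256\n" > "$demo_yaml"
set_cluster_name "$demo_yaml" "Tecnotes Cluster"
grep '^cluster_name:' "$demo_yaml"
rm -f "$demo_yaml" "$demo_yaml.bak"
```

Run against the real file, the function would need sudo, and the .bak copy it leaves behind serves as an extra backup.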

Log in to the Cassandra CQL terminal with cqlsh:

cqlsh

Run the following command to change the cluster name to “Tecnotes Cluster”:

UPDATE system.local SET cluster_name = 'Tecnotes Cluster' WHERE KEY = 'local';

Replace “Tecnotes Cluster” with your desired name. Once done, type exit to leave the console.

Run the following command to flush the system keyspace to disk. This command will not disturb your node’s data.

nodetool flush system

Finally, restart the Cassandra service:

sudo service cassandra restart

Log in with cqlsh again and verify that the new cluster name appears in the connection banner.
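The name can also be checked without opening the interactive shell, because cqlsh accepts a statement through its -e option. The sketch below queries the same system.local table that the UPDATE statement modified; show_cluster_name is a hypothetical wrapper name.

```shell
# show_cluster_name: print the cluster name stored in the system.local table.
# The wrapper name is an illustrative assumption.
show_cluster_name() {
    # -e runs one CQL statement and exits instead of opening the interactive shell.
    cqlsh -e "SELECT cluster_name FROM system.local;"
}

if command -v cqlsh >/dev/null 2>&1; then
    show_cluster_name
else
    echo "cqlsh not found on PATH"
fi
```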

Conclusion

Cassandra is a very useful database manager. In this tutorial, we learned how to install Cassandra on Ubuntu 18.04. We also showed how to edit important parameters in the Cassandra configuration file and how to use the cqlsh utility.

If you want to learn more, we recommend checking the official documentation!
