How to download and install Apache Cassandra using a tarball on Ubuntu 16.04/18.04/AWS EC2

Introduction

In the previous tutorial, we discussed how to install and access Apache Cassandra from the Debian packages. In this tutorial, we will install Cassandra from the downloaded binary tarball, which is a simple method. For the package-based approach, see our previous tutorial, Install Apache Cassandra from deb packages.

Installation method:

For most users, installing the binary tarball is the simplest choice. The tarball unpacks all its contents into a single location, with binaries and configuration files in their own subdirectories. Its most notable advantage is that it does not require root permissions and can be installed on any Linux distribution.

Before We Begin

We covered the installation of Java 8 and Python 2.7 in the previous tutorial, so we skip those steps here. Please follow Install Apache Cassandra from deb packages to install the prerequisites for this tutorial, then continue with this section.

Installing the binary tarball

1. Verify the installation of Java and Python

java -version
openjdk version "1.8.0_265"
OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~18.04-b01)
OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)

python -V
Python 2.7.17
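
The two checks above can be folded into one small helper. This is only a sketch: the function name require_cmd is made up for illustration, and it confirms that a command is on the PATH, not which version it is.

```shell
# Hypothetical helper (not part of Cassandra): report whether a command
# is available on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Usage: check both prerequisites before continuing.
# require_cmd java && require_cmd python || echo "install the prerequisites first"
```

If either command is reported missing, go back to the prerequisite steps before continuing.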

2. Download the binary tarball from one of the mirrors on the Apache Cassandra Download site.

Here we will download the latest Apache Cassandra 3.11 release at the time of writing: 3.11.8

The download page lists several mirror sites. Here I am selecting the first mirror site for the download:

http://apachemirror.wuchna.com/cassandra/3.11.8/apache-cassandra-3.11.8-bin.tar.gz

Use curl to download the file:

curl -OL http://apachemirror.wuchna.com/cassandra/3.11.8/apache-cassandra-3.11.8-bin.tar.gz
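
A slightly more defensive version of the download step might look like the sketch below. The function names tarball_url and download_tarball are invented here, and the URL layout is simply the one from the mirror link above. Mirrors change over time, so the mirror base URL is passed in rather than hard-coded. The -f option makes curl return a non-zero status on an HTTP error instead of saving the error page as if it were the archive.

```shell
# Hypothetical helpers; the URL layout follows the mirror link used above.
tarball_url() {
    # $1 = mirror base URL, $2 = Cassandra version
    echo "$1/cassandra/$2/apache-cassandra-$2-bin.tar.gz"
}

download_tarball() {
    # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
    curl -fLO "$(tarball_url "$1" "$2")"
}

# Usage:
# download_tarball http://apachemirror.wuchna.com 3.11.8
```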

3. (Optional) Verify the integrity of the downloaded tarball:

Verify the integrity of the downloaded tarball using one of the methods here. For example, to verify the hash of the downloaded file using GPG:

gpg --print-md SHA256 downloaded_file

Replace downloaded_file with the name of the file we downloaded:

gpg --print-md SHA256 apache-cassandra-3.11.8-bin.tar.gz

The command prints the hash directly, without any prompt:

apache-cassandra-3.11.8-bin.tar.gz: 3D04E4B7 9C3F264C B491E1BC 5127EC46 5102B550
05A6C8F4 AF8548B3 2E74BF50

Now compare this hash with the SHA256 file from the Downloads site:

Note: Normally we fetch KEYS, PGP signatures, and hashes (SHA* etc.) from the backup mirror rather than from the mirror site the tarball came from. On the backup mirror the tarball is at:

https://downloads.apache.org/cassandra/3.11.8/apache-cassandra-3.11.8-bin.tar.gz

The hash file has the same URL with .sha256 appended, so our command will look like:

curl -L https://downloads.apache.org/cassandra/3.11.8/apache-cassandra-3.11.8-bin.tar.gz.sha256

The response is the same hash produced by the previous command, in lowercase and without spaces:

3d04e4b79c3f264cb491e1bc5127ec465102b55005a6c8f4af8548b32e74bf50
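
Comparing the two hashes by eye is error-prone, so the comparison can be scripted. The sketch below uses two hypothetical helpers, normalize_hash and verify_sha256. It assumes sha256sum from GNU coreutils is installed, and that the .sha256 file contains only the hash, as in the response shown above.

```shell
# Hypothetical helper: turn gpg's spaced, uppercase digest into the compact
# lowercase form used in the .sha256 file.
normalize_hash() {
    printf '%s' "$1" | tr -d ' \n' | tr 'A-F' 'a-f'
}

# Hypothetical helper: succeed only if the file's SHA-256 equals the expected
# value. Assumes sha256sum (GNU coreutils) is available.
verify_sha256() {
    # $1 = file to check, $2 = expected hash (any spacing or letter case)
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$(normalize_hash "$2")" ]
}

# Usage:
# expected=$(curl -fL https://downloads.apache.org/cassandra/3.11.8/apache-cassandra-3.11.8-bin.tar.gz.sha256)
# verify_sha256 apache-cassandra-3.11.8-bin.tar.gz "$expected" && echo "hash OK"
```

A hash match only shows the download is intact. Checking the PGP signature against the KEYS file is a separate step.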

4. Unpack the tarball:

Now unpack the tarball using the tar command as below:

tar xzvf apache-cassandra-3.11.8-bin.tar.gz

Note: You can also copy the Cassandra archive to the desired installation directory, for example /usr/local, before unpacking it. For this tutorial I downloaded the file to the home directory, so that is where Cassandra will be installed.

The files will be extracted to the apache-cassandra-3.11.8 directory. This is the tarball installation location.

This tarball location contains directories for the scripts, binaries, utilities, configuration, data, and log files:

CASSANDRA-14092.txt CHANGES.txt LICENSE.txt NEWS.txt NOTICE.txt bin conf doc interface javadoc lib pylib tools
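
To unpack straight into another installation directory instead of copying the archive there first, tar's -C option can be used. The helper name unpack_into below is hypothetical, and the target path in the usage line is only an example.

```shell
# Hypothetical helper: unpack a gzipped tarball into a chosen parent directory.
unpack_into() {
    # $1 = tarball, $2 = target directory (created if missing)
    mkdir -p "$2" && tar xzf "$1" -C "$2"
}

# Usage (writing to /usr/local normally needs sudo; a directory under $HOME does not):
# unpack_into apache-cassandra-3.11.8-bin.tar.gz "$HOME/opt"
# ls "$HOME/opt/apache-cassandra-3.11.8"
```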

Note: If you install Cassandra from the tarball, the Cassandra configuration files are in the conf directory within the tarball install location.

You can add directories for data, commitlog, and saved_caches. You can create these directories anywhere or in the default locations configured in the Cassandra_install_dir/conf/cassandra.yaml file. For example:

  • /var/lib/cassandra/data
  • /var/lib/cassandra/commitlog
  • /var/lib/cassandra/saved_caches

You can also add a directory for logging. You can create this directory anywhere, such as /var/log/cassandra/.

To learn more about Cassandra configuration, please visit the official Cassandra documentation.
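
Creating these directories can be sketched as follows. The helper name make_cassandra_dirs and the example paths in the usage line are assumptions for illustration. Creating the directories does not make Cassandra use them: the matching cassandra.yaml settings also need to point at them.

```shell
# Hypothetical helper: create the data, commitlog, saved_caches and log
# directories under the given base paths.
make_cassandra_dirs() {
    # $1 = data base directory (e.g. /var/lib/cassandra), $2 = log directory
    mkdir -p "$1/data" "$1/commitlog" "$1/saved_caches" "$2"
}

# Usage (paths under /var usually need sudo plus a chown to the user that runs Cassandra):
# make_cassandra_dirs "$HOME/cassandra-data" "$HOME/cassandra-logs"
#
# The matching cassandra.yaml settings are data_file_directories,
# commitlog_directory and saved_caches_directory.
```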

Starting Cassandra as a stand-alone process

Cassandra’s default configuration file, cassandra.yaml, is sufficient to explore a simple single-node cluster.

To start Cassandra in the foreground:

Change to the install location. Here it is ~/apache-cassandra-3.11.8, since we downloaded the file to the home directory, so we run:

cd apache-cassandra-3.11.8
bin/cassandra -f

To monitor the progress of the startup, run this from the same directory in a second terminal:

tail -f logs/system.log

Cassandra is ready when it shows an entry like this in the system.log:

INFO [main] 2020-09-04 12:18:11,351 Server.java:159 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
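
Instead of watching the log by hand, a script can poll for that entry. The sketch below uses a hypothetical helper named wait_for_cql. It only looks for the message text, and system.log is kept across restarts, so a line from an earlier run would also match.

```shell
# Hypothetical helper: succeed once the log contains the CQL listener message,
# or fail after the given number of one-second attempts.
wait_for_cql() {
    # $1 = path to system.log, $2 = maximum attempts (default 60)
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        if grep -q "Starting listening for CQL clients" "$1" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage, from the install directory:
# wait_for_cql logs/system.log 120 && echo "Cassandra is ready"
```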

To start Cassandra in the background:

cd apache-cassandra-3.11.8

Then run the command below to start Cassandra:

bin/cassandra

Stopping Cassandra as a stand-alone process

To stop the Cassandra Java server process on a tarball installation, follow these steps:

Find the Cassandra Java process ID (PID), and then kill the process using its PID number:

ps auwx | grep cassandra

The output lists the Cassandra Java process; its PID is in the second column.

Kill the process by its PID. Here it is 7889, so the command will be:
sudo kill -9 7889

If you run the ps command again, the Java process no longer appears, because it has been killed.
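
Picking the PID out of the ps output can also be scripted. Note that ps piped into grep cassandra matches the grep process itself as well. The hypothetical helper below, cassandra_pids, assumes the Cassandra JVM's command line contains the main class name CassandraDaemon, and uses a bracket in the pattern so the filter does not match its own command line.

```shell
# Hypothetical helper: read `ps auwx` output on stdin and print the PID
# (second column) of each line that mentions CassandraDaemon. The [C]
# bracket keeps the filter from matching its own command line.
cassandra_pids() {
    awk '/[C]assandraDaemon/ { print $2 }'
}

# Usage: a plain kill sends SIGTERM, which gives the JVM a chance to shut
# down cleanly; keep kill -9 as a last resort.
# for pid in $(ps auwx | cassandra_pids); do kill "$pid"; done
```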

Check Cassandra status using nodetool

To check the status of Cassandra, we can use the nodetool command as below:

bin/nodetool status

The status column in the output should report UN which stands for Up/Normal.
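
For use in a script, the same check can be automated by reading the first column of each node line. This is a sketch with a hypothetical helper named all_nodes_up. It assumes node lines begin with a two-letter code, where the first letter is U or D (Up/Down) and the second is N, L, J or M (Normal/Leaving/Joining/Moving).

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if at least one node is listed and every listed node is UN.
all_nodes_up() {
    awk '
        $1 ~ /^[UD][NLJM]$/ { nodes++; if ($1 != "UN") bad++ }
        END { exit (nodes > 0 && bad == 0) ? 0 : 1 }
    '
}

# Usage, from the install directory:
# bin/nodetool status | all_nodes_up && echo "all nodes are Up/Normal"
```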

Conclusion:

In this session we learned how to install and configure Apache Cassandra from the downloaded tarball. We also discussed how to start Cassandra and the basics of configuring it to run as a single node. In upcoming sessions we will learn how to configure Cassandra to run in a multi-node cluster environment.

To learn more about Apache Cassandra, please check the official Apache Cassandra documentation.
