Posts Tagged ‘cassandra’

Real Time Data Replication from Oracle to DataStax Cassandra using Oracle GoldenGate 23ai DAA

Posted by FatDBA on March 16, 2026

Modern enterprise architectures rarely live inside a single database engine anymore. It is very common to see operational data stored in relational systems like Oracle while analytics or distributed applications rely on NoSQL platforms such as Cassandra.

In several real-world projects I have worked on, Cassandra clusters were used to power high-volume applications while Oracle remained the authoritative system of record. In these cases, Oracle GoldenGate 23ai DAA becomes a powerful bridge that streams transactional changes from Oracle redo logs directly into Cassandra in near real time.

In this article we will walk through a complete end-to-end implementation of Oracle → DataStax Cassandra replication using Oracle GoldenGate.

The goal is to demonstrate:

• Initial data load
• Continuous change capture
• Replication of DML operations
• Validation on the Cassandra side

Source System : Oracle Database Server
Hostname: ora-prod01.lab.internal
IP: 10.10.10.11

Operating System: Oracle Linux 8
Database Version: Oracle 19c (19.21 RU)
GoldenGate Version: 23ai Classic Deployment

Target System: DataStax Enterprise Cassandra Node
Hostname: cass-node01.lab.internal
IP: 10.10.10.21

Operating System: Rocky Linux 8
Cassandra Distribution: DataStax Enterprise 6.8.50

GoldenGate Big Data / Java Adapter is installed on the Cassandra side.

Step 1 – Prepare Oracle Source Database

Oracle GoldenGate captures data changes directly from the Oracle redo logs, allowing near real-time replication without querying tables. To support this, the database must have ARCHIVELOG mode, force logging, and supplemental logging enabled so that all row-level changes are fully recorded and available for capture.

sqlplus / as sysdba

archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled

-- enable archivelog mode
shutdown immediate;
startup mount;
alter database archivelog;
alter database open;

-- enable force logging
alter database force logging;

-- enable supplemental logging
alter database add supplemental log data;

-- enable the GoldenGate replication parameter in the source database
alter system set enable_goldengate_replication=true scope=both;

-- streams pool size 
alter system set streams_pool_size=2G scope=both;

Step 2 – Create GoldenGate User in Oracle

A dedicated database user is required for GoldenGate to capture changes from the Oracle redo logs and access the necessary metadata. This user is granted the required privileges to read transaction logs, access dictionary information, and manage replication objects.

CREATE TABLESPACE ogg_ts 
DATAFILE '/u02/oradata/ogg_ts01.dbf' 
SIZE 500M AUTOEXTEND ON;

CREATE USER ggadmin IDENTIFIED BY "Welcome#123"
DEFAULT TABLESPACE ogg_ts
TEMPORARY TABLESPACE temp;

GRANT CONNECT, RESOURCE TO ggadmin;
GRANT SELECT ANY TABLE TO ggadmin;
GRANT SELECT ANY TRANSACTION TO ggadmin;
GRANT SELECT ANY DICTIONARY TO ggadmin;
GRANT FLASHBACK ANY TABLE TO ggadmin;
GRANT ALTER SYSTEM TO ggadmin;
GRANT UNLIMITED TABLESPACE TO ggadmin;


-- Grant OGG admin privileges
-- (anonymous block, because a SQL*Plus "exec" call must fit on one line)
BEGIN
   dbms_goldengate_auth.grant_admin_privilege(
      grantee => 'GGADMIN',
      privilege_type => 'CAPTURE',
      grant_select_privileges => TRUE,
      do_grants => TRUE);
END;
/

Step 3 – Create Source Table in Oracle

To validate the replication pipeline, create a sample table and insert a few test records in the Oracle source database. These records will later be used to verify that GoldenGate successfully captures and replicates the data to the DataStax Cassandra target.

CREATE TABLE SALES_TRANSACTIONS
(
   TXN_ID NUMBER PRIMARY KEY,
   CUSTOMER_NAME VARCHAR2(100),
   PRODUCT_CODE VARCHAR2(50),
   TXN_AMOUNT NUMBER,
   TXN_TIMESTAMP DATE
);

INSERT INTO SALES_TRANSACTIONS VALUES
(101,'John Carter','PRD-A',1200,SYSDATE);

INSERT INTO SALES_TRANSACTIONS VALUES
(102,'Alice Brown','PRD-B',850,SYSDATE);

INSERT INTO SALES_TRANSACTIONS VALUES
(103,'Michael Lee','PRD-C',430,SYSDATE);

COMMIT;


select * from SALES_TRANSACTIONS;

TXN_ID CUSTOMER_NAME PRODUCT_CODE TXN_AMOUNT TXN_TIMESTAMP
-----------------------------------------------------------
101    John Carter   PRD-A        1200       05-MAR-26
102    Alice Brown   PRD-B        850        05-MAR-26
103    Michael Lee   PRD-C        430        05-MAR-26

Step 4 – Install DataStax Cassandra

On the target node install DataStax Enterprise. Example repository configuration:

sudo vi /etc/yum.repos.d/datastax.repo

[datastax]
name=DataStax Repo
baseurl=https://rpm.datastax.com/enterprise
enabled=1
gpgcheck=0

-- Install DSE 
sudo yum install dse-full-6.8.50 -y

-- Start DSE 
sudo systemctl start dse

-- Verify cluster/node status
nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal

--  Address       Load       Tokens Owns  Host ID                               Rack
UN  10.10.10.21   120 KB     256    100%  8c34a0c1-b8c1-4b35-a6a7-9c2d8bfae111   rack1

Step 5 – Create Cassandra Keyspace

In DataStax Cassandra, a keyspace is the top-level namespace that defines how data is replicated across the cluster. Before creating tables for replication, we first create a keyspace and configure its replication strategy and replication factor.

cqlsh 10.10.10.21 <port> -u xxxx -p xxx

CREATE KEYSPACE retail_ks
WITH replication = {
'class': 'NetworkTopologyStrategy',
'DC1': 1
};

-- verify 
DESCRIBE KEYSPACES;


-- Create the target table in Cassandra
USE retail_ks;

CREATE TABLE sales_transactions
(
   txn_id int,
   customer_name text,
   product_code text,
   txn_amount decimal,
   txn_timestamp timestamp,
   PRIMARY KEY(txn_id)
);

-- verify table structure
DESCRIBE TABLE retail_ks.sales_transactions;
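
The replication factor chosen for the keyspace decides how many replicas must acknowledge an operation at quorum-based consistency levels. As a rough illustration, a quorum is a majority of the replicas, i.e. floor(RF / 2) + 1. The helper name quorum below is my own and not part of any Cassandra tooling; treat it as a sketch of the arithmetic only.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum for a given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


# retail_ks above uses 'DC1': 1, so a single replica already is the quorum.
for rf in (1, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerates {rf - quorum(rf)} replica(s) down")
```

With a replication factor of 1, as in this single-node lab, any quorum-based consistency level needs that one node to be up; a factor of 3 would tolerate one replica being down.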

Step 6 – Install GoldenGate on Oracle

Oracle GoldenGate software must be installed on the source server to capture database changes. In production environments, the silent installation method is commonly used because it allows automated and repeatable deployments without manual interaction.

-- Unpack GoldenGate binaries.
unzip 231000_fbo_ggs_Linux_x64_Oracle.zip

-- Response file (ogg.rsp) for the silent installation
INSTALL_OPTION=ora23ai
SOFTWARE_LOCATION=/u01/app/ogg
START_MANAGER=true
MANAGER_PORT=7809
DATABASE_LOCATION=/u01/app/oracle/product/19c


-- Install
./runInstaller -silent -responseFile ogg.rsp

-- Launch OGG
cd /u01/app/ogg
./ggsci

GGSCI> INFO ALL
Program     Status
MANAGER     RUNNING

Step 7 – Enable Table Level Replication

GGSCI> DBLOGIN USERID ggadmin PASSWORD Welcome#123

GGSCI> ADD TRANDATA SALES_TRANSACTIONS
OGG-15132 Logging supplemental redo data enabled

Step 8 – Configure Extract

Create parameter file.

GGSCI> EDIT PARAMS EXT_SALES

EXTRACT EXT_SALES
USERID ggadmin PASSWORD Welcome#123
EXTTRAIL ./dirdat/st
TRANLOGOPTIONS INTEGRATEDPARAMS (PARALLELISM 4)
TABLE SALES_TRANSACTIONS;


-- Create extract
ADD EXTRACT EXT_SALES INTEGRATED TRANLOG BEGIN NOW
ADD EXTTRAIL ./dirdat/st EXTRACT EXT_SALES

-- start & verify its status
START EXTRACT EXT_SALES
INFO EXTRACT EXT_SALES

EXTRACT    EXT_SALES     Last Started 2026-03-15 21:48   Status RUNNING
Checkpoint Lag           00:00:00 (updated 00:00:04 ago)
Process ID               32418
Log Read Checkpoint      Oracle Redo Logs
                        SCN  845732918 (0x0000.3267a3b6)
                        2026-03-15 21:48:12

Step 9 – Configure Data Pump

The GoldenGate Data Pump reads captured changes from the local trail files and forwards them to the target GoldenGate deployment. This process improves reliability and allows data to be securely transported to the target environment.

EDIT PARAMS PUMP_SALES

EXTRACT PUMP_SALES
RMTHOST 10.10.10.21, MGRPORT 7809
RMTTRAIL ./dirdat/rt
TABLE SALES_TRANSACTIONS;

-- add pump
ADD EXTRACT PUMP_SALES EXTTRAILSOURCE ./dirdat/st
ADD RMTTRAIL ./dirdat/rt EXTRACT PUMP_SALES

-- Start Pump
START EXTRACT PUMP_SALES
INFO EXTRACT PUMP_SALES

EXTRACT    PUMP_SALES    Last Started 2026-03-15 21:48   Status RUNNING
Checkpoint Lag           00:00:00 (updated 00:00:02 ago)
Process ID               32455
Log Read Checkpoint      File ./dirdat/st000000000
                        RBA 18425
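
For routine monitoring it can help to turn this status output into something a script can check. The following is a minimal sketch that parses the two header lines of the INFO output shown above; the function name parse_info and the returned field names are my own choices, and the regular expressions only assume the layout printed in this article.

```python
import re

SAMPLE = """\
EXTRACT    EXT_SALES     Last Started 2026-03-15 21:48   Status RUNNING
Checkpoint Lag           00:00:00 (updated 00:00:04 ago)
"""


def parse_info(text: str) -> dict:
    """Extract process type, name, status and checkpoint lag (in seconds)."""
    header = re.search(r"^(EXTRACT|REPLICAT)\s+(\S+).*Status\s+(\S+)", text, re.M)
    lag = re.search(r"Checkpoint Lag\s+(\d+):(\d+):(\d+)", text)
    if header is None or lag is None:
        raise ValueError("unrecognised INFO output")
    hours, minutes, seconds = (int(part) for part in lag.groups())
    return {
        "type": header.group(1),
        "name": header.group(2),
        "status": header.group(3),
        "lag_seconds": hours * 3600 + minutes * 60 + seconds,
    }


info = parse_info(SAMPLE)
print(info)
```

A wrapper script could alert when status is not RUNNING or when lag_seconds exceeds a threshold that suits the workload.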

Step 10 – Configure Cassandra Handler

On the GoldenGate Big Data / DAA node, configure the Cassandra handler properties so that the Replicat process knows how to connect to the DataStax Cassandra cluster and apply incoming changes. This configuration defines connection details, consistency level, datacenter information, and the Java classpath required for the Cassandra driver.

Note that the sample values in this listing (contact point 10.20.30.41, keyspace retail_ops_ks, datacenter DC_MAIN) differ from the node and keyspace created in Steps 4 and 5; substitute the contact point, keyspace, and datacenter of your own cluster.

[oggadmin@cass-hub01 ~]$ vi /u01/app/ogg_23ai/dirprm/cassandra.props
gg.handlerlist=cassandra

gg.handler.cassandra.type=cassandra
gg.handler.cassandra.mode=op
gg.handler.cassandra.contactPoints=10.20.30.41
gg.handler.cassandra.port=9042
gg.handler.cassandra.keyspace=retail_ops_ks
gg.handler.cassandra.consistencyLevel=LOCAL_QUORUM
gg.handler.cassandra.cassandraMode=async
gg.handler.cassandra.compressedUpdates=true
gg.handler.cassandra.datacenter=DC_MAIN
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec
gg.classpath=/opt/datastax/cassandra/resources/cassandra/lib/*:/u01/app/ogg_23ai/DependencyDownloader/dependencies/cassandra_4.1/*
javawriter.stats.full=true
javawriter.stats.display=TRUE

-- If authentication is enabled in DataStax Cassandra, the property file can also include credentials, depending on the adapter version and security model being used.
-- example
gg.handler.cassandra.username=oggsvc
gg.handler.cassandra.password=Str0ngPassw0rd
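
Before starting the Replicat, a quick sanity check of the properties file can catch a missing or empty entry early. The sketch below parses key=value lines and reports absent keys. The REQUIRED list is my own assumption about what this particular setup cannot run without, not an official list, and the helper names are hypothetical.

```python
# Keys this sketch treats as mandatory (an assumption, not an official list).
REQUIRED = (
    "gg.handlerlist",
    "gg.handler.cassandra.type",
    "gg.handler.cassandra.contactPoints",
    "gg.classpath",
)

SAMPLE = """\
gg.handlerlist=cassandra
gg.handler.cassandra.type=cassandra
gg.handler.cassandra.mode=op
gg.handler.cassandra.contactPoints=10.20.30.41
gg.handler.cassandra.port=9042
gg.classpath=/opt/datastax/cassandra/resources/cassandra/lib/*
"""


def parse_props(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props: dict, required=REQUIRED) -> list:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not props.get(key)]


props = parse_props(SAMPLE)
print(missing_keys(props))
```

In practice the text would be read from dirprm/cassandra.props instead of the inline SAMPLE string.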

Step 11 – Configure Replicat

At this stage, the source-side Extract and Data Pump are already running and writing trail records to the remote trail on the target GoldenGate deployment. The next step is to configure the Replicat process on the Cassandra side so that trail records can be applied into the target DataStax Cassandra keyspace and table.

In a GoldenGate Big Data or DAA style deployment, the first thing you should always do is connect to the deployment administration interface and verify that the deployment is up before creating the Replicat.

-- Log in to the deployment
[oggadmin@cass-hub01 ~]$ cd /u01/app/ogg_23ai/bin
[oggadmin@cass-hub01 bin]$ ./adminclient

Oracle GoldenGate Admin Client for Oracle
Version 23.10.0.25.10 OGGCORE_23.10.0.0.0_PLATFORMS_251018.0830

Copyright (C) 1995, 2025, Oracle and/or its affiliates. All rights reserved.

Linux, x64, 64bit (optimized), Generic on Oct 18 2025 14:00:54
Operating system character set identified as UTF-8.


-- connect to the deployment
OGG (not connected) 1> CONNECT http://10.10.40.25:9001 deployment CassandraDAA as oggadmin password "xxxxxx";
Successfully connected to CassandraDAA.

Before creating the Replicat, verify that the remote trail files are arriving from the Data Pump. This is a simple but important validation step. If the trail is not being written, the Replicat will start but will have nothing to process.

[oggadmin@cass-hub01 ~]$ cd /u01/app/ogg_23ai/var/lib/data
[oggadmin@cass-hub01 data]$ ls -ltr rt*

-rw-r----- 1 oggadmin oinstall  1048576 Mar 15 20:11 rt000000000
-rw-r----- 1 oggadmin oinstall   524288 Mar 15 20:16 rt000000001

This confirms that the remote trail prefix rt is present and the target deployment is receiving records from the source-side Pump process.

Next, let's create the Replicat.

OGG (http://10.10.40.25:9001 CassandraDAA as oggadmin@CassandraDAA) 3> EDIT PARAMS REP_CASS

REPLICAT REP_CASS
TARGETDB LIBFILE libggjava.so SET property=dirprm/cassandra.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 1000
DISCARDFILE ./var/lib/data/REP_CASS.dsc, APPEND, MEGABYTES 500
DISCARDROLLOVER AT 02:00
MAP APP_OWNER.SALES_ORDERS_SRC,
TARGET retail_ops_ks.sales_orders_rt;

-- add the Replicat process
OGG (http://10.10.40.25:9001 CassandraDAA as oggadmin@CassandraDAA) 4> ADD REPLICAT REP_CASS, EXTTRAIL ./var/lib/data/rt
REPLICAT added.

-- start the process
OGG (http://10.10.40.25:9001 CassandraDAA as oggadmin@CassandraDAA) 5> START REPLICAT REP_CASS
Sending START request to Manager ...
REPLICAT REP_CASS starting.

REPLICAT   REP_CASS      Last Started 2026-03-15 21:49   Status RUNNING
Checkpoint Lag           00:00:00 (updated 00:00:01 ago)
Process ID               32504
Log Read Checkpoint      File ./dirdat/rt000000000
                        RBA 18874

Parameter explanations:

REPLICAT REP_CASS defines the Replicat group name.

TARGETDB LIBFILE libggjava.so SET property=dirprm/cassandra.props tells GoldenGate to use the Java-based Big Data adapter library and load all Cassandra handler settings from the referenced properties file.

REPORTCOUNT EVERY 1 MINUTES, RATE writes the count of processed records, together with the processing rate, to the report file every minute, which is very useful during initial testing and performance validation.

GROUPTRANSOPS 1000 batches operations to improve throughput. This value can be tuned depending on transaction profile, row size, and Cassandra write latency.

DISCARDFILE captures records that fail processing, while DISCARDROLLOVER keeps the discard file manageable.

The MAP statement defines the source Oracle table and the target DataStax Cassandra table.

In this example:
Oracle source table: APP_OWNER.SALES_ORDERS_SRC
Cassandra target table: retail_ops_ks.sales_orders_rt

Step 12 – Testing End-to-End Replication (Oracle to DataStax Cassandra)

In this section we will validate that the replication pipeline is working correctly by performing a few test transactions on the Oracle source database and confirming that the changes are applied on the DataStax Cassandra target table.

I am not going to discuss the initial load process here since it is relatively straightforward and typically performed using standard GoldenGate initial load methods or bulk data movement tools. For the purpose of this test, we assume that the base dataset has already been loaded into Cassandra and that GoldenGate is responsible only for capturing and replicating incremental changes.

-- Source DB

SQL*Plus: Release 19.0.0.0.0 - Production
Version 19.21.0.0.0

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0

SELECT * FROM SALES_ORDERS_SRC;
ORDER_ID   CUSTOMER_NAME   ORDER_STATUS   ORDER_AMOUNT
--------   -------------   ------------   ------------
1001       NORTH RETAIL    CREATED        450
1002       ALPHA STORES    SHIPPED        720
1003       OMEGA MART      CREATED        610

On Target/Cassandra cluster

cqlsh 10.20.30.41 9042 -u cassandra -p cassandra
Connected to RetailCluster at 10.20.30.41:9042
[cqlsh 6.1.0 | DSE 6.8.x | CQL spec 3.4.6]

USE retail_ops_ks;

SELECT * FROM sales_orders_rt;
order_id | customer_name | order_status | order_amount
---------+---------------+--------------+--------------
1001     | NORTH RETAIL  | CREATED      | 450
1002     | ALPHA STORES  | SHIPPED      | 720
1003     | OMEGA MART    | CREATED      | 610

-- This confirms that the initial dataset already exists on both systems.

Let's make some changes on the source and see how they reach the target.

INSERT INTO SALES_ORDERS_SRC
VALUES (1004,'PACIFIC TRADING','CREATED',980);
COMMIT;

1 row created.

Commit complete.



-- Verify OGG
INFO EXTRACT EXT_SALES
EXTRACT EXT_SALES Last Started 2026-03-16 20:12 Status RUNNING
Checkpoint Lag 00:00:00
Log Read Checkpoint
SCN 845736122

-- The SCN has advanced, which confirms that the Extract process captured the transaction from the Oracle redo logs.

Now let's check it on the target.

SELECT * FROM sales_orders_rt
WHERE order_id = 1004;

order_id | customer_name    | order_status | order_amount
---------+------------------+--------------+--------------
1004     | PACIFIC TRADING  | CREATED      | 980

The inserted record has successfully replicated.

By performing these test operations, we validated that:

• GoldenGate Extract captured Oracle redo changes
• Data Pump delivered trail files to the target deployment
• Replicat applied changes into DataStax Cassandra

The Oracle → GoldenGate → Cassandra replication pipeline is now functioning correctly.
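
Eyeballing a handful of rows works for a demo, but the same check can be scripted. The sketch below compares two result sets keyed by primary key and reports rows that are missing, unexpected, or different on the target. The rows are hard-coded from the outputs above; in a real check they would be fetched through the Oracle and Cassandra drivers, which is outside the scope of this sketch. The function name diff_rows is my own.

```python
def diff_rows(source: dict, target: dict) -> dict:
    """Compare two {primary_key: row} mappings and report discrepancies."""
    missing = sorted(key for key in source if key not in target)
    extra = sorted(key for key in target if key not in source)
    mismatched = sorted(
        key for key in source if key in target and source[key] != target[key]
    )
    return {
        "missing_on_target": missing,
        "extra_on_target": extra,
        "mismatched": mismatched,
    }


# Rows as shown in the test above: order_id -> (customer, status, amount).
oracle_rows = {
    1001: ("NORTH RETAIL", "CREATED", 450),
    1002: ("ALPHA STORES", "SHIPPED", 720),
    1003: ("OMEGA MART", "CREATED", 610),
    1004: ("PACIFIC TRADING", "CREATED", 980),
}
cassandra_rows = dict(oracle_rows)

print(diff_rows(oracle_rows, cassandra_rows))
```

All three lists being empty means the two sides agree for the keys that were compared.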

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: 23ai, bigdata, cassandra, daa, goldengate, migration, ogg, renaps, replication

Datastax Certified Cassandra Administrator, some tips & more

Posted by FatDBA on August 21, 2020

Hi Guys,

With the sharp rise of NoSQL databases, many organizations are moving from traditional databases to distributed, high-performance databases like Cassandra, which has become one of Apache’s most popular projects. There are many NoSQL databases on the market, but few combine features such as a peer-to-peer architecture, high availability and fault tolerance, column-based storage, high performance, a schema-less design, tunable consistency, great analytical possibilities, easy scale-up and scale-down, and distribution, and the list goes on.

Cassandra has already proved its mettle and works very well for IoT, sensor data, event-based and time-series data, voucher generation systems, and other data models. DataStax provides best-in-class database management software and a wide range of services with 24×7 support to get more from your Cassandra. Alongside come some really cool features and tools, e.g. OpsCenter (GUI), NodeSync (for anti-entropy repairs), great Solr integration, dsetool (similar to nodetool with more capabilities), sstableloader, a pre-flight check tool, yaml file compare tools, stress tools, extra commands such as dsefs, and many more.

DataStax is a pioneer and has its own Cassandra certification track to prove you have valid credentials to work with Cassandra as either a developer or an administrator. Now the question is where to start. Many of you have asked me about my latest credential, ‘Datastax Apache Cassandra 3.x Administrator Associate‘: how to prepare, how to book the exam, and many other related questions. So this post covers how to prepare for and book the exam, along with a few tips.

I prefer to go point by point to keep things ordered and easy to digest.

1. Create your account on Datastax Academy.
Link: https://auth.cloud.datastax.com/auth/realms/CloudUsers/login-actions/registration?client_id=absorb&tab_id=lv4-57nRbu4

2. Go to the ‘Catalog’ option to look for the available courses.
You have to choose between the Administrator track (a three-course curriculum) and the Developer track (also three courses). I completed the ADMIN path, which has three courses: DS101 (Introduction), DS201 (Foundations) and DS210 (Operations with Apache Cassandra). All of the courses are beautifully designed and contain a large number of demos, presentations, guides, quizzes and a pre-built Ubuntu VM where you can do all the exercises.

Although the presentations and the program cover all major topics and parameters, if you want to read in more depth, the documentation is available at https://docs.datastax.com/en/landing_page/doc/landing_page/current.html or at https://cassandra.apache.org/doc/latest/

Note: There are a few other specialized courses in the catalog too, e.g. Kafka connectors, DSE Graph, DSE Analytics, DSE Search etc.

3. Other learning platforms
– Github: https://github.com/datastax
Can be very useful, especially if you are preparing for the developer track.
– Youtube: Full of great presentations, videos, workshops and demos.
https://www.youtube.com/user/DataStaxMedia
– Twitter: For news (about webinars etc.), press releases and other exciting information.
https://twitter.com/DataStax (@DataStax)

4. All set!
Once you have completed all three courses under the ADMIN track, you are ready for the certification. Go to the ‘Datastax Certification’ widget within the catalog and book your exam by creating your profile on their certification website.
https://certification.mettl.com/datastax/applicant/signup

Currently they are giving one free exam voucher to workshop participants; vouchers are issued at the end of the series.

5. Once registered you have to choose your exam type – Admin or Developer.
Both exams have 60 questions that you have to complete within 90 minutes; the exam fee (right now) is $145.
Note: It’s good to check your system compatibility before the exam; for more details follow their official guidelines.

So don’t wait: enroll for the course, grab the chance of a free certification attempt and, more importantly, stand out from the crowd. These widely accepted and recognized credentials will help your continued professional development and are an ideal way to gain a greater understanding of your industry and to enhance your knowledge and skills. They also offer excellent chances to network among Cassandra geeks.

Hope It Helps!
Prashant Dixit

Posted in Basics, Uncategorized | Tagged: cassandra

Passed ‘Datastax Apache Cassandra 3.x Administrator Associate’ Certification

Posted by FatDBA on August 21, 2020

Hi Guys,

News to share: today I cleared the ‘Datastax Apache Cassandra 3.x Administrator Associate‘ certification. I have been working with Cassandra for the past few months, and it’s always good to get certified by the market leaders. A good way to stand out from the crowd!

I will soon write a blog post about the preparation and the process of getting certified by Datastax, stay tuned!

Hope It Helps!
Prashant D

Posted in Basics | Tagged: cassandra

Cassandra node refused to start – MismatchedInputException: No content to map due to end-of-input

Posted by FatDBA on August 5, 2020

Hi All,

This will be a quick one, in error-and-solution format.
This time it’s about a Cassandra startup error that I faced while doing some testing on a POC box, and how I resolved it.

In this case I was using Datastax Cassandra 6.8.1 on RHEL6. I tried starting the instance, but it failed with an error that says “MismatchedInputException: No content to map due to end-of-input”.
I attempted to start it a few more times but ended up with the same error message. Below are the complete error details.


WARN  [main] 2020-08-05 13:43:46,277  DatabaseDescriptor.java:1517 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO  [main] 2020-08-05 13:43:46,281  DseDelegateSnitch.java:39 - Setting my workloads to [Cassandra]
INFO  [main] 2020-08-05 13:43:46,358  YamlConfigurationLoader.java:77 - Configuration location: file:/home/cassandra/dse-6.8.1_node1/resources/cassandra/conf/cassandra.yaml
INFO  [main] 2020-08-05 13:43:46,360  DseDelegateSnitch.java:41 - Initialized DseDelegateSnitch with workloads [Cassandra], delegating to com.datastax.bdp.snitch.DseSimpleSnitch
INFO  [main] 2020-08-05 13:43:46,424  YamlConfigurationLoader.java:77 - Configuration location: file:/home/cassandra/dse-6.8.1_node1/resources/cassandra/conf/cassandra.yaml
ERROR [main] 2020-08-05 13:43:47,904  DseModule.java:126 - Unable to start server. Exiting...
org.apache.cassandra.io.FSReadError: com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (BufferedInputStream); line: -1, column: 0]
        at com.datastax.bdp.db.nodes.Nodes.transactionalRead(Nodes.java:195)
        at com.datastax.bdp.db.nodes.Nodes.access$100(Nodes.java:76)
        at com.datastax.bdp.db.nodes.Nodes$Local.(Nodes.java:477)
        at com.datastax.bdp.db.nodes.Nodes.(Nodes.java:97)
        at com.datastax.bdp.db.nodes.Nodes$Instance.setup(Nodes.java:911)
        at org.apache.cassandra.config.DatabaseDescriptor.applyMetadata(DatabaseDescriptor.java:587)
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:234)
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:249)
        at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:84)
        at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:163)
        at com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:27)
        at com.datastax.bdp.DseModule.configure(DseModule.java:89)
        at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
        at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
        at com.google.inject.spi.Elements.getElements(Elements.java:110)
        at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
        at com.google.inject.Guice.createInjector(Guice.java:96)
        at com.google.inject.Guice.createInjector(Guice.java:73)
        at com.google.inject.Guice.createInjector(Guice.java:62)
        at com.datastax.bdp.ioc.DseInjector.get(DseInjector.java:36)
        at com.datastax.bdp.DseModule.main(DseModule.java:103)
Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (BufferedInputStream); line: -1, column: 0]

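Before describing the fix, some context: the database had been shut down abruptly the last time because the server/host crashed. The fix I applied was to remove the ‘local’ file under the node metadata directory. This file contains local startup information such as data center, release info, schema version, native transport address, port info (JMX, storage etc.), broadcast address etc. The “No content to map due to end-of-input” message suggests that the crash left this file empty or truncated, so it could no longer be parsed at startup.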


[cassandra@fatdba-doccass bin]$ cd /var/lib/cassandra/metadata/nodes/
[cassandra@fatdba-doccass nodes]$ ls
local  peers  snapshots
[cassandra@fatdba-doccass nodes]$ rm -rf local
[cassandra@fatdba-doccass nodes]$
[cassandra@fatdba-doccass nodes]$
[cassandra@fatdba-doccass nodes]$ ls -ltrh
total 4.0K
drwxrwxr-x. 2 cassandra cassandra 6 Jul 27 16:07 snapshots
-rw-rw-r--. 1 cassandra cassandra 1 Aug  4 18:29 peers
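
A more cautious variant of the same fix is to rename the file rather than delete it, and only when it really is empty. The sketch below is my own suggestion and not what was run on this node. It assumes ‘local’ is a regular file (as ‘peers’ is in the listing above), and the helper name set_aside_if_empty is hypothetical. The demonstration runs against a throwaway temporary directory, not a real Cassandra installation.

```python
import os
import tempfile


def set_aside_if_empty(path: str) -> bool:
    """Rename path to path + '.bak' if it is a zero-length regular file.

    Returns True if the file was set aside, False otherwise.
    """
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        os.replace(path, path + ".bak")
        return True
    return False


# Demonstration on a temporary directory standing in for metadata/nodes.
with tempfile.TemporaryDirectory() as nodes_dir:
    local = os.path.join(nodes_dir, "local")
    open(local, "w").close()  # simulate the file left empty by the crash
    moved = set_aside_if_empty(local)
    print(moved, sorted(os.listdir(nodes_dir)))
```

Keeping the .bak copy means the original content, if any, is still available should the node need it later.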

Now let’s try to start the node once again.


[cassandra@fatdba-doccass bin]$
[cassandra@fatdba-doccass bin]$ ./dse cassandra

And it started after clearing the local details that had been left behind when the node crashed.

Hope It Helps
Prashant Dixit

Posted in Advanced, troubleshooting | Tagged: cassandra, troubleshooting

How to tune the IO contentions related with the Compaction in Cassandra ?

Posted by FatDBA on August 20, 2017

Hi Fellas,
I am back, this time with some performance tuning options for Cassandra during the ‘Compaction’ process.
Before I proceed, I would like to explain a bit about compaction in Cassandra: what exactly it is and why it is a necessary evil.

Compaction in Cassandra refers to the operation of merging multiple SSTables into a single new one. Typically, compaction is done in a database for two primary reasons:

– To reduce the storage usage.
– To improve read performance by merging keys and obtaining a consolidated index.

For example, in Apache Cassandra, data files are merged periodically to form compacted SSTables.

Compaction can easily cause contention in the database, because it increases I/O contention on SSTable reads. Writes in Cassandra are generally fast, so the impact on writes may not be visible, but reads from SSTables slow down when I/O contention increases due to compaction activity, which degrades the performance of the database.

First, let's discuss how to identify compaction-related contention on the database.
– We can use the “nodetool tablestats” command (or the older “nodetool cfstats”) to monitor SSTables.
Below is a sample result from one of the Cassandra database servers. Here we need to check the following:
– Check whether the SSTable count keeps growing, because that indicates there may be contention between SSTable reads and the compaction process.
– Reads generally slow down because data is fragmented across many SSTables while compaction runs continuously in the background.

%nodetool tablestats -H dixit.playlist
Keyspace: dixit
Read Count: 182849
Read Latency: 0.11363755339104945 ms.
Write Count: 435355
Write Latency: 0.01956930550929701 ms.
Pending Flushes: 0
Table: standard1
SSTable count: 2
Space used (live): 51.62 MB
Space used (total): 51.62 MB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 302.36 KB
SSTable Compression Ratio: 0.0
Number of keys (estimate): 376390
Memtable cell count: 200120
Memtable data size: 45.16 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 2
Local read count: 182849
Local read latency: 0.125 ms
Local write count: 435355
Local write latency: 0.022 ms
Pending flushes: 0
Bloom filter false positives: 11
Bloom filter false ratio: 0.00000
Bloom filter space used: 265.81 KB
Bloom filter off heap memory used: 265.8 KB
Index summary off heap memory used: 36.57 KB
Compression metadata off heap memory used: 0 bytes
Compacted partition minimum bytes: 216 bytes
Compacted partition maximum bytes: 258 bytes
Compacted partition mean bytes: 258 bytes
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1

Below is the command to check compaction statistics; here you need to look at the ‘pending tasks’ and the total bytes in progress.

$ nodetool compactionstats
pending tasks: 5
compaction type keyspace table completed total unit progress
Compaction Keyspace1 Standard1 282310680 302170540 bytes 93.43%
Compaction Keyspace1 Standard1 58457931 307520780 bytes 19.01%
Active compaction remaining time : 0h00m16s
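
To track these numbers over time rather than reading them by eye, the output can be parsed with a few lines of code. The sketch below handles only the column layout shown above (other Cassandra versions print additional columns, which this sketch does not cover), and the function and field names are my own.

```python
SAMPLE = """\
pending tasks: 5
compaction type keyspace table completed total unit progress
Compaction Keyspace1 Standard1 282310680 302170540 bytes 93.43%
Compaction Keyspace1 Standard1 58457931 307520780 bytes 19.01%
Active compaction remaining time : 0h00m16s
"""


def parse_compactionstats(text: str) -> dict:
    """Return pending task count plus per-compaction remaining bytes and progress."""
    pending = 0
    compactions = []
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("pending tasks:"):
            pending = int(fields[-1])
        # Data rows: type, keyspace, table, completed, total, unit, progress.
        elif len(fields) == 7 and fields[3].isdigit() and fields[4].isdigit():
            completed, total = int(fields[3]), int(fields[4])
            progress = round(100 * completed / total, 2) if total else 0.0
            compactions.append({
                "keyspace": fields[1],
                "table": fields[2],
                "remaining_bytes": total - completed,
                "progress_pct": progress,
            })
    return {"pending_tasks": pending, "compactions": compactions}


stats = parse_compactionstats(SAMPLE)
print(stats["pending_tasks"], sum(c["remaining_bytes"] for c in stats["compactions"]))
```

A pending task count that keeps climbing between samples is the signal to look for, since it means compaction is not keeping up with flushes.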

Solution to the problem
1. First one is quite simple – Avoid merging of update/delete requests.
2. Reduce how often in-memory objects (memtables) are flushed.

This can be done by increasing the size of the memtables so that the database does not flush as frequently.
– Fewer flushes lead to fewer SSTables to compact.
– Less compaction reduces I/O contention, which in turn improves reads.
– There are a couple of parameters that control flushing. In old Cassandra releases these were memtable_flush_after_mins, memtable_throughput_in_mb and memtable_operations_in_millions; in Cassandra 3.x the corresponding cassandra.yaml settings are memtable_heap_space_in_mb, memtable_offheap_space_in_mb and memtable_cleanup_threshold.

3. One more solution, which only applies to systems where the I/O stress is infrequent: reduce the compaction “thread priority”, which reduces its I/O.
Lowering the priority slows down compaction writes, so it only helps if compaction does not run frequently.

Add the line below to the cassandra-env.sh file (under the conf folder) to lower the compaction priority.

JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1"

One last point.
If I/O is a genuine bottleneck, you will need to add more nodes or replace the disks with better-performing, high-I/O disks.

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: cassandra, performance, Tuning

Installing Cassandra Cluster Manager (CCM) on Oracle Linux 7

Posted by FatDBA on July 20, 2017

Hi All,
Today I am going to discuss CCM, the Cassandra Cluster Manager, a tool for creating a multi-node Cassandra cluster on a local machine. It can be used to mimic a production-like cluster setup locally, which helps you understand how clustering works in Cassandra.

Below I am going to show how to create a 3-node Cassandra cluster on OEL7 with Cassandra version 3.11.0.

Step 1:
First download pip and install it along with the PyYAML package.
To download ‘pip’, ‘wheel’ and ‘Python setuptools’, follow the link
https://packaging.python.org/tutorials/installing-packages/#install-pip-setuptools-and-wheel

[root@fatdba ~]# ls -ltrh
total 168M
-rw-r--r--. 1 root root 163M Mar 16 01:35 jdk-8u131-linux-x64.rpm
-rw-------. 1 root root 1.4K Jun 17 12:59 anaconda-ks.cfg
-rw-r--r--. 1 root root 1.5K Jun 17 13:34 initial-setup-ks.cfg
-rw-r--r--. 1 root root 4.2M Jun 17 17:01 master.zip
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Templates
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Public
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Downloads
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Desktop
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Videos
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Pictures
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Music
drwxr-xr-x. 2 root root 6 Jul 17 11:08 Documents
-rw-r--r--. 1 root root 1.6M Jul 17 13:44 get-pip.py

[root@fatdba ~]# python get-pip.py
Collecting pip
Downloading pip-9.0.1-py2.py3-none-any.whl (1.3MB)
100% ████████████████████████████████ 1.3MB 51kB/s
Collecting wheel
Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB)
100% ████████████████████████████████ 71kB 430kB/s
Installing collected packages: pip, wheel
Successfully installed pip-9.0.1 wheel-0.29.0
[root@fatdba ~]#
[root@fatdba ~]#
[root@fatdba ~]# which pip
/usr/bin/pip
[root@fatdba ~]#

[root@fatdba ~]# pip install cql PyYAML
Collecting cql
Downloading cql-1.4.0.tar.gz (76kB)
100% ████████████████████████████████ 81kB 252kB/s
Collecting PyYAML
Downloading PyYAML-3.12.tar.gz (253kB)
100% ████████████████████████████████ 256kB 308kB/s
Collecting thrift (from cql)
Downloading thrift-0.10.0.zip (87kB)
100% ████████████████████████████████ 92kB 568kB/s
Requirement already satisfied: six>=1.7.2 in /usr/lib/python2.7/site-packages (from thrift->cql)
Building wheels for collected packages: cql, PyYAML, thrift
Running setup.py bdist_wheel for cql ... done
Stored in directory: /root/.cache/pip/wheels/e6/b3/50/fdb7532df6817694ae467c7aaedb991c2104b463ab31f7a94f
Running setup.py bdist_wheel for PyYAML ... done
Stored in directory: /root/.cache/pip/wheels/2c/f7/79/13f3a12cd723892437c0cfbde1230ab4d82947ff7b3839a4fc
Running setup.py bdist_wheel for thrift ... done
Stored in directory: /root/.cache/pip/wheels/e7/f1/d3/b472914d95caa1781fb29b1257b85808324b0bfd1838961752
Successfully built cql PyYAML thrift
Installing collected packages: thrift, cql, PyYAML
Successfully installed PyYAML-3.12 cql-1.4.0 thrift-0.10.0

Step 2: Now, using PIP, install the CCM package.

[root@fatdba ~]# pip install ccm
Collecting ccm
Downloading ccm-2.7.0.tar.gz (68kB)
100% ████████████████████████████████ 71kB 186kB/s
Requirement already satisfied: pyYaml in /usr/lib64/python2.7/site-packages (from ccm)
Requirement already satisfied: six>=1.4.1 in /usr/lib/python2.7/site-packages (from ccm)
Building wheels for collected packages: ccm
Running setup.py bdist_wheel for ccm ... done
Stored in directory: /root/.cache/pip/wheels/9d/ec/85/e971d86de3002809194d0c4bb7ee72f9fab55b428c8293cd79
Successfully built ccm
Installing collected packages: ccm
Successfully installed ccm-2.7.0
[root@fatdba ~]#

Step 3: Make required entries in your /etc/hosts file.

bash-4.2$ more /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.40.131 fatdba.localdomain fatdba

#Cassandra Nodes for CCM
127.0.0.1 127.0.0.2
127.0.0.1 127.0.0.3
127.0.0.1 127.0.0.4

Step 4: Now, let’s create the cluster using CCM.
I will create this cluster with the name ‘dixit’ and 3 nodes.

-bash-4.2$ ccm create dixit -v 3.11.0
-bash-4.2$ ccm status
Cluster: 'dixit'
----------------
No node in this cluster yet
-bash-4.2$
-bash-4.2$ ccm populate -n 3

-bash-4.2$ ccm status
Cluster: 'dixit'
----------------
node1: DOWN (Not initialized)
node3: DOWN (Not initialized)
node2: DOWN (Not initialized)

Let’s start the cluster now that all the nodes have been added.
Just to make things a little easier to follow, I will start each node one by one; otherwise the whole cluster can be started in one go with a single command (ccm start).

-bash-4.2$ ccm node1 start
-bash-4.2$ ccm node2 start
-bash-4.2$ ccm node3 start
-bash-4.2$
-bash-4.2$ ccm status
Cluster: 'dixit'
----------------
node1: UP
node3: UP
node2: UP
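
If you want to check the result of ccm status from a script rather than by eye, a small parser is enough. This is only a sketch in Python: it assumes the "nodeN: STATE" line layout shown above, and the function name parse_ccm_status is my own, not part of CCM.

```python
def parse_ccm_status(text):
    """Return {node_name: state} from the text printed by `ccm status`."""
    nodes = {}
    for line in text.splitlines():
        name, sep, state = line.strip().partition(":")
        # Node lines look like "node1: UP"; skip the header and the dashed rule.
        if sep and name.startswith("node"):
            nodes[name] = state.strip()
    return nodes


sample = """Cluster: 'dixit'
----------------
node1: UP
node3: UP
node2: UP
"""

status = parse_ccm_status(sample)
down = sorted(n for n, s in status.items() if not s.startswith("UP"))
print(status, "down:", down)
```

In practice you would feed it the captured output of the real command instead of the pasted sample.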

Step 5: Verify the cluster status.

bash-4.2$ ccm liveset
127.0.0.1,127.0.0.3,127.0.0.2

bash-4.2$ ccm cqlsh node1
Unknown node or command: cqlsh

bash-4.2$ ccm node1 cqlsh
Connected to dixit at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
cqlsh>

bash-4.2$
bash-4.2$ ccm node1 show
node1: UP
cluster=dixit
auto_bootstrap=False
thrift=('127.0.0.1', 9160)
binary=('127.0.0.1', 9042)
storage=('127.0.0.1', 7000)
jmx_port=7100
remote_debug_port=0
byteman_port=0
initial_token=-9223372036854775808
pid=16852
bash-4.2$
bash-4.2$
bash-4.2$ ccm node2 show
node2: UP
cluster=dixit
auto_bootstrap=False
thrift=('127.0.0.2', 9160)
binary=('127.0.0.2', 9042)
storage=('127.0.0.2', 7000)
jmx_port=7200
remote_debug_port=0
byteman_port=0
initial_token=-3074457345618258603
pid=16947
bash-4.2$
bash-4.2$
bash-4.2$ ccm node3 show
node3: UP
cluster=dixit
auto_bootstrap=False
thrift=('127.0.0.3', 9160)
binary=('127.0.0.3', 9042)
storage=('127.0.0.3', 7000)
jmx_port=7300
remote_debug_port=0
byteman_port=0
initial_token=3074457345618258602
pid=17191
bash-4.2$
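
A quick note on the initial_token values in the three outputs above: they look like the signed 64-bit Murmur3 token range split into three equal slices. The Python sketch below reproduces them under that assumption. I have not checked how CCM actually computes them, and balanced_tokens is just my own helper name.

```python
def balanced_tokens(node_count):
    """Evenly spaced tokens over the signed 64-bit Murmur3 range."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_min = -(2 ** 63)            # smallest Murmur3Partitioner token
    step = 2 ** 64 // node_count     # ring size divided among the nodes
    return [ring_min + i * step for i in range(node_count)]


for i, token in enumerate(balanced_tokens(3), start=1):
    print(f"node{i}: initial_token={token}")
```

For three nodes this gives the same three tokens that node1, node2 and node3 report, so each node owns an equal share of the ring.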

Some additional monitoring of the Java processes, heap usage, etc. can be done using the Java Management Console.

Hope That Helps
Prashant Dixit

Posted in Advanced | Tagged: cassandra

Cassandra NodeTool Utility

Posted by FatDBA on July 14, 2017

The nodetool utility gives an easy CLI to perform admin activities and configure the database.
Today I would like to share a few of the commands that I have tried and tested on my own; below are some of them with their syntax and usage details.

Let’s first explore all possible options or attributes of this utility.


Starting NodeTool

Missing required option: h

usage: java org.apache.cassandra.tools.NodeCmd --host
 -h,--host         node hostname or ip address
 -p,--port         remote jmx agent port number
 -pw,--password    remote jmx agent password
 -u,--username     remote jmx agent username
Available commands:
  ring                   - Print informations on the token ring
  join                   - Join the ring
  info                   - Print node informations (uptime, load, ...)
  cfstats                - Print statistics on column families
  clearsnapshot          - Remove all existing snapshots
  version                - Print cassandra version
  tpstats                - Print usage statistics of thread pools
  drain                  - Drain the node (stop accepting writes and flush all column families)
  decommission           - Decommission the node
  loadbalance            - Loadbalance the node
  compactionstats        - Print statistics on compactions
  disablegossip          - Disable gossip (effectively marking the node dead)
  enablegossip           - Reenable gossip
  disablethrift          - Disable thrift server
  enablethrift           - Reenable thrift server
  snapshot [snapshotname] - Take a snapshot using optional name snapshotname
  netstats [host]        - Print network information on provided host (connecting node by default)
  move        - Move node on the token ring to a new token
  removetoken status|force| - Show status of current token removal, force completion of pending removal or remove provided token
  flush [keyspace] [cfnames] - Flush one or more column family
  repair [keyspace] [cfnames] - Repair one or more column family
  cleanup [keyspace] [cfnames] - Run cleanup on one or more column family
  compact [keyspace] [cfnames] - Force a (major) compaction on one or more column family
  scrub [keyspace] [cfnames] - Scrub (rebuild sstables for) one or more column family
  invalidatekeycache [keyspace] [cfnames] - Invalidate the key cache of one or more column family
  invalidaterowcache [keyspace] [cfnames] - Invalidate the key cache of one or more column family
  getcompactionthreshold   - Print min and max compaction thresholds for a given column family
  cfhistograms   - Print statistic histograms for a given column family
  setcachecapacity     - Set the key and row cache capacities of a given column family
  setcompactionthreshold     - Set the min and max compaction thresholds for a given column family

Provides histograms of the read, write and range request latencies recorded by the coordinator at the time you fired this command.


bash-4.2$ nodetool proxyhistograms

proxy histograms
Percentile       Read Latency      Write Latency      Range Latency   CAS Read Latency  CAS Write Latency View Write Latency
                     (micros)           (micros)           (micros)           (micros)           (micros)           (micros)
50%                      0.00               0.00               0.00               0.00               0.00               0.00
75%                      0.00               0.00               0.00               0.00               0.00               0.00
95%                      0.00               0.00               0.00               0.00               0.00               0.00
98%                      0.00               0.00               0.00               0.00               0.00               0.00
99%                      0.00               0.00               0.00               0.00               0.00               0.00
Min                      0.00               0.00               0.00               0.00               0.00               0.00
Max                      0.00               0.00               0.00               0.00               0.00               0.00

Note: I haven’t done any activity on the database, so obviously we are getting 0 for all the values.

To do a sequential repair of all keyspaces on the current node:
bash-4.2$ nodetool repair -seq

Describe the cluster details.


bash-4.2$ nodetool describecluster

Cluster Information:
        Name: Test Cluster
        Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                1852b5d8-f9ba-3549-b4b7-eaae1da39062: [127.0.0.1]

Status of the node.


bash-4.2$ nodetool status

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  190.25 KiB  256          100.0%            0277aea4-d06c-4175-8d57-6100101f0491  rack1
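
The two letters at the start of each row follow the legend printed just above the table: the first is the status (Up/Down) and the second the state (Normal/Leaving/Joining/Moving). A minimal Python sketch of that decoding, using the row from my output; the helper name decode_node_flag is mine:

```python
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def decode_node_flag(flag):
    """Translate a two-letter flag such as 'UN' using the legend printed by nodetool."""
    if len(flag) != 2:
        raise ValueError(f"expected two letters, got {flag!r}")
    return STATUS[flag[0]], STATE[flag[1]]


row = "UN  127.0.0.1  190.25 KiB  256          100.0%            0277aea4-d06c-4175-8d57-6100101f0491  rack1"
flag, address = row.split()[:2]
print(address, decode_node_flag(flag))
```

So UN simply means the node is up and in its normal state, which is what you want to see for every node.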

History of compactions done in the database.


bash-4.2$ nodetool compactionhistory

Compaction History:
id                                   keyspace_name columnfamily_name compacted_at            bytes_in bytes_out rows_merged
39d4ff90-66df-11e7-ba43-41553ec85c87 system        size_estimates    2017-07-12T14:20:59.209 172588   42619     {4:4}
36186cc0-66df-11e7-ba43-41553ec85c87 system        sstable_activity  2017-07-12T14:20:52.664 475      82        {1:8, 4:1}
05558d20-66c6-11e7-ba43-41553ec85c87 system        size_estimates    2017-07-12T11:20:33.714 173036   43201     {4:4}
0424fc60-66c6-11e7-ba43-41553ec85c87 system        sstable_activity  2017-07-12T11:20:31.718 548      83        {1:12, 4:1}
20b362b0-660b-11e7-ba43-41553ec85c87 system        size_estimates    2017-07-11T13:02:43.739 166052   43228     {3:1, 4:3}
203ab040-660b-11e7-ba43-41553ec85c87 system        sstable_activity  2017-07-11T13:02:42.948 687      82        {1:28, 3:1}
62569400-65fa-11e7-ba43-41553ec85c87 system        local             2017-07-11T11:02:52.416 10157    5164      {4:1}
a1d34560-65f5-11e7-ba43-41553ec85c87 system_schema keyspaces         2017-07-11T10:28:51.446 668      277       {1:4, 2:2}
a1955200-65f5-11e7-ba43-41553ec85c87 system_schema tables            2017-07-11T10:28:51.040 5486     2689      {1:3, 2:2}
a0d906e0-65f5-11e7-ba43-41553ec85c87 system_schema columns           2017-07-11T10:28:49.806 10214    5654      {1:3, 2:2}
003788e0-65f2-11e7-ba43-41553ec85c87 system        local             2017-07-11T10:02:51.822 5358     5170      {4:1}
fd05d0f0-65f1-11e7-ba43-41553ec85c87 system        local             2017-07-11T10:02:46.463 5324     5199      {4:1}
fca0f4a0-65f1-11e7-ba43-41553ec85c87 system        local             2017-07-11T10:02:45.802 5346     5171      {4:1}
c604f720-6551-11e7-9add-f1b60320c550 system        local             2017-07-10T14:55:54.706 5166     5067      {4:1}
bd3430c0-6551-11e7-9add-f1b60320c550 system        local             2017-07-10T14:55:39.916 301      148       {4:1}
bb8e9710-6551-11e7-9add-f1b60320c550 system        local             2017-07-10T14:55:37.153 324      148       {4:1}
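
The bytes_in and bytes_out columns tell you how much each compaction shrank the data. Here is a small Python sketch that works this out for two of the rows above. It assumes the whitespace-separated layout shown in this output, and compaction_ratio is a helper name I made up.

```python
def compaction_ratio(row):
    """Return (table, bytes_out / bytes_in) for one `nodetool compactionhistory` row."""
    # rows_merged can contain spaces (e.g. "{1:8, 4:1}"), so split only the first six gaps.
    _id, keyspace, table, _when, bytes_in, bytes_out, _merged = row.split(None, 6)
    return f"{keyspace}.{table}", int(bytes_out) / int(bytes_in)


rows = [
    "39d4ff90-66df-11e7-ba43-41553ec85c87 system        size_estimates    2017-07-12T14:20:59.209 172588   42619     {4:4}",
    "36186cc0-66df-11e7-ba43-41553ec85c87 system        sstable_activity  2017-07-12T14:20:52.664 475      82        {1:8, 4:1}",
]
for row in rows:
    name, ratio = compaction_ratio(row)
    print(f"{name}: output is {ratio:.0%} of input")
```

A ratio close to 1 means the compaction mostly rewrote data without reclaiming much; a small ratio means many overwritten or expired rows were dropped.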

Statistics for any ongoing compaction tasks; pending tasks shows 0 if there are none.

bash-4.2$ nodetool compactionstats
pending tasks: 0

Garbage collection statistics.


bash-4.2$ nodetool gcstats

       Interval (ms) Max GC Elapsed (ms)Total GC Elapsed (ms)Stdev GC Elapsed (ms)   GC Reclaimed (MB)         Collections      Direct Memory Bytes
            36066339                9200               57107                2102          2612889352                  32                       -1

bash-4.2$
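
The raw gcstats numbers are easier to judge once you derive the average pause and the share of time spent in GC from them. A short Python sketch using the values from my output above; the dictionary keys are my own labels for the columns, not names used by nodetool:

```python
header = ["interval_ms", "max_gc_ms", "total_gc_ms", "stdev_gc_ms",
          "reclaimed", "collections", "direct_memory_bytes"]
values = "36066339 9200 57107 2102 2612889352 32 -1".split()
gc = dict(zip(header, map(int, values)))

# Average pause = total GC time / number of collections in the interval.
avg_pause_ms = gc["total_gc_ms"] / gc["collections"]
# Share of the interval that was spent in garbage collection.
gc_share = gc["total_gc_ms"] / gc["interval_ms"]
print(f"average pause: {avg_pause_ms:.1f} ms, time spent in GC: {gc_share:.2%}")
```

Keep in mind that the interval covers the time since gcstats was last run, so compare runs taken over similar windows.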

Log levels defined in database for all areas.

bash-4.2$ nodetool getlogginglevels

Logger Name Log Level
ROOT INFO
com.thinkaurelius.thrift ERROR
org.apache.cassandra DEBUG
bash-4.2$

Tracing probabilities currently set in DB.

bash-4.2$ nodetool gettraceprobability
Current trace probability: 0.0
bash-4.2$

Gossip protocol related statistics.


bash-4.2$ nodetool gossipinfo

localhost/127.0.0.1
  generation:1499747570
  heartbeat:36966
  STATUS:15:NORMAL,-1019516550404639999
  LOAD:36910:255305.0
  SCHEMA:1623:1852b5d8-f9ba-3549-b4b7-eaae1da39062
  DC:6:datacenter1
  RACK:8:rack1
  RELEASE_VERSION:4:3.11.0
  RPC_ADDRESS:3:127.0.0.1
  NET_VERSION:1:11
  HOST_ID:2:0277aea4-d06c-4175-8d57-6100101f0491
  RPC_READY:20:true
  TOKENS:14:
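
One handy detail in this output is the generation value. As far as I understand it, this number is normally taken from the node's start time in seconds since the Unix epoch, so it can be turned back into a date. A minimal Python sketch under that assumption (generation_to_utc is my own helper name):

```python
from datetime import datetime, timezone


def generation_to_utc(generation):
    """Interpret a gossip generation number as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(generation, tz=timezone.utc)


started = generation_to_utc(1499747570)  # value from the gossipinfo output above
print(started.isoformat())
```

For my node this works out to 11 July 2017, the day the Cassandra service was last started.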

Provides network information about the host machine.


bash-4.2$ nodetool netstats

Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0              0         0
Small messages                  n/a         0              4         0
Gossip messages                 n/a         0              0         0

bash-4.2$

Hope It Helps
Prashant Dixit

Posted in Basics | Tagged: cassandra

Installing Cassandra 3.0 on OEL7 – Steps!

Posted by FatDBA on July 11, 2017

Hi All,
Though many of you know and have heard about the Cassandra database, I would still like to first explain it a little for anyone who is not very familiar with the term, and then follow with the steps on how to install Cassandra 3.0 on OEL7.

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database. Let us first understand what a NoSQL database does.

A NoSQL database (sometimes called Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. These databases are schema-free, support easy replication, have simple APIs, are eventually consistent, and can handle huge amounts of data.

In this post, we will be installing Apache Cassandra on Oracle Enterprise Linux 7 server.

Step 1: Installing Java
Java is the main prerequisite for Cassandra.

[root@fatdba ~]# wget --no-cookies --no-check-certificate --header "Cookie:oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm"
--2017-07-10 14:31:35-- http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm
Resolving download.oracle.com (download.oracle.com)... 96.17.182.42, 96.17.182.49
Connecting to download.oracle.com (download.oracle.com)|96.17.182.42|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://edelivery.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm [following]
--2017-07-10 14:31:37-- https://edelivery.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm
Resolving edelivery.oracle.com (edelivery.oracle.com)... 104.108.215.179, 2600:1417:2c:190::2d3e, 2600:1417:2c:195::2d3e
Connecting to edelivery.oracle.com (edelivery.oracle.com)|104.108.215.179|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm?AuthParam=1499677417_13d7c840391385e13fc7e785791cb8d6 [following]
--2017-07-10 14:31:37-- http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm?AuthParam=1499677417_13d7c840391385e13fc7e785791cb8d6
Connecting to download.oracle.com (download.oracle.com)|96.17.182.42|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 169983496 (162M) [application/x-redhat-package-manager]
Saving to: ‘jdk-8u131-linux-x64.rpm’

100%[==============================================================================================================================>] 169,983,496 1.13MB/s in 2m 27s

2017-07-10 14:34:04 (1.11 MB/s) - ‘jdk-8u131-linux-x64.rpm’ saved [169983496/169983496]

[root@fatdba ~]# yum -y localinstall jdk-8u131-linux-x64.rpm
Loaded plugins: langpacks, ulninfo
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Examining jdk-8u131-linux-x64.rpm: 2000:jdk1.8.0_131-1.8.0_131-fcs.x86_64
Marking jdk-8u131-linux-x64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package jdk1.8.0_131.x86_64 2000:1.8.0_131-fcs will be installed
--> Finished Dependency Resolution
ol7_UEKR4/x86_64 | 1.2 kB 00:00:00
ol7_UEKR4/x86_64/updateinfo | 73 kB 00:00:00
ol7_UEKR4/x86_64/primary | 18 MB 00:00:17
ol7_addons/x86_64 | 1.2 kB 00:00:00
ol7_addons/x86_64/updateinfo | 38 kB 00:00:00
ol7_addons/x86_64/primary | 73 kB 00:00:00
ol7_latest/x86_64 | 1.4 kB 00:00:00
ol7_latest/x86_64/updateinfo | 1.3 MB 00:00:01
ol7_latest/x86_64/primary | 26 MB 00:00:26

Dependencies Resolved

========================================================================================================================================================================
Package Arch Version Repository Size
========================================================================================================================================================================
Installing:
jdk1.8.0_131 x86_64 2000:1.8.0_131-fcs /jdk-8u131-linux-x64 269 M

Transaction Summary
========================================================================================================================================================================
Install 1 Package

Total size: 269 M
Installed size: 269 M
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : 2000:jdk1.8.0_131-1.8.0_131-fcs.x86_64 1/1
Unpacking JAR files...
tools.jar...
plugin.jar...
javaws.jar...
deploy.jar...
rt.jar...
jsse.jar...
charsets.jar...
localedata.jar...
Verifying : 2000:jdk1.8.0_131-1.8.0_131-fcs.x86_64 1/1

Installed:
jdk1.8.0_131.x86_64 2000:1.8.0_131-fcs

Complete!

Verify the Java version after the installation.

[root@fatdba ~]# java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)

Step 2: Installing Cassandra

[root@fatdba etc]# yum -y install cassandra
Loaded plugins: langpacks, ulninfo
Resolving Dependencies
--> Running transaction check
---> Package cassandra.noarch 0:3.11.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================================================================================
Package Arch Version Repository Size
========================================================================================================================================================================
Installing:
cassandra noarch 3.11.0-1 cassandra 28 M

Transaction Summary
========================================================================================================================================================================
Install 1 Package

Total download size: 28 M
Installed size: 37 M
Downloading packages:
cassandra-3.11.0-1.noarch.rpm | 28 MB 00:02:25
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction

Installing : cassandra-3.11.0-1.noarch 1/1
Verifying : cassandra-3.11.0-1.noarch 1/1

Installed:
cassandra.noarch 0:3.11.0-1

Complete!
[root@fatdba etc]#
[root@fatdba etc]#

Reload your system daemons by running:

[root@fatdba etc]# systemctl daemon-reload

Start Cassandra using the following command.

[root@fatdba etc]# systemctl start cassandra

You can verify the status of Cassandra using the following command.

[root@fatdba default.conf]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  175.32 KiB  256          100.0%            0277aea4-d06c-4175-8d57-6100101f0491  rack1

bash-4.2$ nodetool describecluster
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
1852b5d8-f9ba-3549-b4b7-eaae1da39062: [127.0.0.1]

Other checks to confirm the Cassandra status.

[root@fatdba default.conf]# service cassandra sattus
Usage: cassandra start|stop|status|restart|reload
[root@fatdba default.conf]# service cassandra status
● cassandra.service – LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: active (running) since Tue 2017-07-11 10:02:03 IST; 1min 37s ago
Docs: man:systemd-sysv-generator(8)
Process: 28704 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, status=0/SUCCESS)
Main PID: 28795 (java)
Memory: 8.0K
CGroup: /system.slice/cassandra.service
‣ 28795 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre/bin/java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa…

Jul 11 10:01:36 fatdba.localdomain systemd[1]: Starting LSB: distributed storage system for structured data…
Jul 11 10:01:37 fatdba.localdomain su[28714]: (to cassandra) root on none
Jul 11 10:02:03 fatdba.localdomain systemd[1]: Started LSB: distributed storage system for structured data.
Jul 11 10:02:03 fatdba.localdomain cassandra[28704]: Starting Cassandra: OK

[root@fatdba etc]#

[root@fatdba etc]# ps -ef|grep cassandra
cassand+ 28795 1 12 10:02 ? 00:00:01 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre/bin/java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseNUMA -XX:+PerfDisableSharedMem -Djava.net.preferIPv4Stack=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSWaitDuration=10000 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -Xms1024M -Xmx1024M -Xmn100M -XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar -Dcassandra.jmx.local.port=7199 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password -Djava.library.path=/usr/share/cassandra/lib/sigar-bin -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp 
/etc/cassandra/conf:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/asm-5.0.4.jar:/usr/share/cassandra/lib/caffeine-2.2.6.jar:/usr/share/cassandra/lib/cassandra-driver-core-3.0.1-shaded.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.9.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/concurrent-trees-2.4.0.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/ecj-4.4.2.jar:/usr/share/cassandra/lib/guava-18.0.jar:/usr/share/cassandra/lib/HdrHistogram-2.1.9.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/hppc-0.5.4.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.3.0.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/jctools-core-1.2.1.jar:/usr/share/cassandra/lib/jflex-1.6.0.jar:/usr/share/cassandra/lib/jna-4.4.0.jar:/usr/share/cassandra/lib/joda-time-2.4.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:
/usr/share/cassandra/lib/jstackjunit-0.0.1.jar:/usr/share/cassandra/lib/libthrift-0.9.2.jar:/usr/share/cassandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/logback-classic-1.1.3.jar:/usr/share/cassandra/lib/logback-core-1.1.3.jar:/usr/share/cassandra/lib/lz4-1.3.0.jar:/usr/share/cassandra/lib/metrics-core-3.1.0.jar:/usr/share/cassandra/lib/metrics-jvm-3.1.0.jar:/usr/share/cassandra/lib/metrics-logback-3.1.0.jar:/usr/share/cassandra/lib/netty-all-4.0.44.Final.jar:/usr/share/cassandra/lib/ohc-core-0.4.4.jar:/usr/share/cassandra/lib/ohc-core-j8-0.4.4.jar:/usr/share/cassandra/lib/reporter-config3-3.0.3.jar:/usr/share/cassandra/lib/reporter-config-base-3.0.3.jar:/usr/share/cassandra/lib/sigar-1.6.4.jar:/usr/share/cassandra/lib/slf4j-api-1.7.7.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.1.1.7.jar:/usr/share/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-3.11.0.jar:/usr/share/cassandra/apache-cassandra-thrift-3.11.0.jar:/usr/share/cassandra/stress.jar: org.apache.cassandra.service.CassandraDaemon
root 28828 25570 0 10:02 pts/1 00:00:00 grep --color=auto cassandra

Cassandra has a built-in command-line shell, cqlsh, for running queries against the database/cluster. CQL is the Cassandra Query Language. To connect with cqlsh, run the following command.

[root@fatdba default.conf]# cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
cqlsh>
cqlsh>

Let’s run some tests on the command line.
– Let’s check the entire list of commands that exist for the Cassandra terminal.

cqlsh> help

Documented shell commands:
===========================
CAPTURE CLS COPY DESCRIBE EXPAND LOGIN SERIAL SOURCE UNICODE
CLEAR CONSISTENCY DESC EXIT HELP PAGING SHOW TRACING

CQL help topics:
================
AGGREGATES               CREATE_KEYSPACE           DROP_TRIGGER      TEXT
ALTER_KEYSPACE           CREATE_MATERIALIZED_VIEW  DROP_TYPE         TIME
ALTER_MATERIALIZED_VIEW  CREATE_ROLE               DROP_USER         TIMESTAMP
ALTER_TABLE              CREATE_TABLE              FUNCTIONS         TRUNCATE
ALTER_TYPE               CREATE_TRIGGER            GRANT             TYPES
ALTER_USER               CREATE_TYPE               INSERT            UPDATE
APPLY                    CREATE_USER               INSERT_JSON       USE
ASCII                    DATE                      INT               UUID
BATCH                    DELETE                    JSON
BEGIN                    DROP_AGGREGATE            KEYWORDS
BLOB                     DROP_COLUMNFAMILY         LIST_PERMISSIONS
BOOLEAN                  DROP_FUNCTION             LIST_ROLES
COUNTER                  DROP_INDEX                LIST_USERS
CREATE_AGGREGATE         DROP_KEYSPACE             PERMISSIONS
CREATE_COLUMNFAMILY      DROP_MATERIALIZED_VIEW    REVOKE
CREATE_FUNCTION          DROP_ROLE                 SELECT
CREATE_INDEX             DROP_TABLE                SELECT_JSON

– Let me check the CLUSTER details and CONSISTENCY level (Default).

cqlsh> CONSISTENCY
Current consistency level is ONE.
cqlsh> describe cluster;

Cluster: Test Cluster
Partitioner: Murmur3Partitioner

– Now I am going to create a table.

cqlsh> CREATE TABLE prashant.dixit_alt_stats ( id UUID PRIMARY KEY, lastname text, birthday timestamp, nationality text, weight text, height text );
ConfigurationException: Keyspace prashant doesn't exist

Oops, an error. It says that the keyspace I used doesn’t exist, so let me quickly create one.
cqlsh> create keyspace prashant
   ... WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};

Let’s try to create the table once again.
cqlsh> CREATE TABLE prashant.dixit_alt_stats ( id UUID PRIMARY KEY, lastname text, birthday timestamp, nationality text, weight text, height text );

Done!
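
A side note on the keyspace I just created: it asks for replication_factor 3, but this box is a single node, so only one replica can actually exist. Requests at the default consistency level ONE (shown earlier) are fine, but anything that needs more replicas than there are live nodes would not be. A small Python sketch of that arithmetic; the helper names are mine and only a few consistency levels are covered:

```python
def quorum(replication_factor):
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replicas needed for a few common consistency levels (SimpleStrategy, one datacenter)."""
    needed = {"ONE": 1, "TWO": 2, "THREE": 3,
              "QUORUM": quorum(replication_factor), "ALL": replication_factor}
    return needed[level]


live_nodes = 1   # the single-node install from this post
rf = 3           # replication_factor used in the CREATE KEYSPACE statement
for level in ("ONE", "QUORUM", "ALL"):
    need = replicas_required(level, rf)
    print(f"{level}: needs {need}, satisfiable with {live_nodes} node(s): {need <= live_nodes}")
```

On a test box this does no harm, but on a real cluster you would pick a replication factor no larger than the number of nodes.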

Now, let’s query the newly created table and send the result to a file at a location of our choice.
For that we have the CAPTURE command, which captures the output of a command and adds it to a file; any query run after it (here, a select on the new table) is written there.

cqlsh> CAPTURE '/tmp/outputfile'
Now capturing query output to '/tmp/outputfile'.

[root@fatdba tmp]# more outputfile

 id | birthday | height | lastname | nationality | weight
----+----------+--------+----------+-------------+--------

(0 rows)

Hope It Helps
Prashant Dixit

Posted in Basics | Tagged: cassandra

Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯
