Cassandra : Replication Factor & Consistency Level

In a production system, it is advisable to have three or more Cassandra nodes in each data center, and the default replication factor should be three. As a general rule, the replication factor should not exceed the number of Cassandra nodes in the cluster. If there are 3 nodes in your cluster then how can you store data on more than three nodes?

If you add additional Cassandra nodes to the cluster, the default replication factor is not affected.

For example, if you increase the number of Cassandra nodes to six, but leave the replication factor at three, that means data will be replicated on just three nodes not on all six nodes. If you are adding of deleting nodes in a cluster, it is your responsibility to change the replication factor. If a node goes down, a higher replication factor means a higher probability that the data on the node exists on one of the remaining nodes. The downside of a higher replication factor is an increased latency on data writes.

All the nodes exchange information with each other using Gossip protocol. Gossip is a protocol in Cassandra by which nodes can communicate with each other.

There are two kinds of replication strategies in Cassandra. Which are mentioned here.

The Cassandra consistency level is defined as the minimum number of Cassandra nodes that must acknowledge a read or write operation before the operation can be considered successful. Different consistency levels can be assigned to different Edge keyspaces.

This table describes various types of write/read consistency levels.

When connecting to Cassandra for read and write operations, generally used consistency level value is LOCAL_QUORUM. However, some keyspaces are defined to use a consistency level of one.

The calculation of the value of LOCAL_QUORUM for a data center is:

LOCAL_QUORUM = (replication_factor/2) + 1

As described above, the default replication factor for an Edge production environment with three Cassandra nodes is three. Therefore, the default value of LOCAL_QUORUM = (3/2) +1 = 2 (the value is rounded down to an integer).

With LOCAL_QUORUM = 2, at least two of the three Cassandra nodes in the data center must respond to a read/write operation for the operation to succeed. For a three node Cassandra cluster, the cluster could therefore tolerate one node being down per data center.

By specifying the consistency level as LOCAL_QUORUM, Edge avoids the latency required by validating operations across multiple data centers. If a keyspace used the Cassandra QUORUM value as the consistency level, read/write operations would have to be validated across all data centers.

If you add additional Cassandra nodes to the cluster, the consistency level is not affected. Consistency level can be different for read and write queries however it is advised to use same consistency level for read and write.