Replication Factor

Whenever we start Cassandra, we start a cluster, which is a collection of 1 or more than one nodes. A cluster can be splitted onto more than one datacenter. Cassandra’s architecture was that the hardware failure can occur at any time. Any node can be down. Even entire datacenter can go down so it is better to add more than one data centers in your cluster. Cassandra stores data replicas on multiple nodes to ensure reliability and fault tolerance. The total number of replicas for a keyspace across a Cassandra cluster is referred to as the keyspace’s replication factor. A replication factor of one means that there is only one copy of each row in the Cassandra cluster. A replication factor of two means there are two copies of each row, where each copy is on a different node. All replicas are equally important; there is no primary or master replica.

In a production system, it is advisable to have three or more Cassandra nodes in each data center, and the default replication factor should be three. As a general rule, the replication factor should not exceed the number of Cassandra nodes in the cluster. If there are 3 nodes in your cluster then how can you store data on more than three nodes?

If you add additional Cassandra nodes to the cluster, the default replication factor is not affected.

For example, if you increase the number of Cassandra nodes to six, but leave the replication factor at three, that means data will be replicated on just three nodes not on all six nodes. If you are adding of deleting nodes in a cluster, it is your responsibility to change the replication factor. If a node goes down, a higher replication factor means a higher probability that the data on the node exists on one of the remaining nodes. The downside of a higher replication factor is an increased latency on data writes.

All the nodes exchange information with each other using Gossip protocol. Gossip is a protocol in Cassandra by which nodes can communicate with each other.

There are two kinds of replication strategies in Cassandra. Which are mentioned here.

Consistency Level

When you write any data in a cluster which has four nodes and replication factor three, data is stored on three nodes. This replication of data on three nodes is quick but not instant. When you write a data and instantly read the same data, it is possible that data is not yet completly replicated and some of the nodes are returning old data. Then how Cassandra enforces consistency? Simple answer is - Using consistency level.

The Cassandra consistency level is defined as the minimum number of Cassandra nodes that must acknowledge a read or write operation before the operation can be considered successful. Different consistency levels can be assigned to different Edge keyspaces.

This table describes various types of write/read consistency levels.

When connecting to Cassandra for read and write operations, generally used consistency level value is LOCAL_QUORUM. However, some keyspaces are defined to use a consistency level of one.

The calculation of the value of LOCAL_QUORUM for a data center is:

LOCAL_QUORUM = (replication_factor/2) + 1 

As described above, the default replication factor for an Edge production environment with three Cassandra nodes is three. Therefore, the default value of LOCAL_QUORUM = (32) +1 = 2 (the value is rounded down to an integer).

With LOCAL_QUORUM = 2, at least two of the three Cassandra nodes in the data center must respond to a read/write operation for the operation to succeed. For a three node Cassandra cluster, the cluster could therefore tolerate one node being down per data center.

By specifying the consistency level as LOCAL_QUORUM, Edge avoids the latency required by validating operations across multiple data centers. If a keyspace used the Cassandra QUORUM value as the consistency level, read/write operations would have to be validated across all data centers.

If you add additional Cassandra nodes to the cluster, the consistency level is not affected. Consistency level can be different for read and write queries however it is advised to use same consistency level for read and write.