An interesting problem I had this week, and one that's obvious when you think about it.
The Config
I have a cluster of 5 Cassandra (version 2.0.11.83) nodes. I execute:
DROP KEYSPACE IF EXISTS my_keyspace;
CREATE KEYSPACE my_keyspace WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 2
};
USE my_keyspace;
CREATE TABLE my_table (
my_int_id varint,
my_string_id varchar,
my_value decimal,
PRIMARY KEY (my_int_id, my_string_id)
);
I then start hammering my cluster with some CQL like:
INSERT INTO my_table
(my_int_id, my_string_id, my_value)
VALUES (1, '2', 42.0);
or suchlike. The code I use to write to the DB looks something like this (note that the CQL string being prepared must use ? bind markers rather than literal values, since I bind the values later):

// cql is the statement to prepare, e.g.:
// "INSERT INTO my_table (my_int_id, my_string_id, my_value) VALUES (?, ?, ?)"
RegularStatement toPrepare = (RegularStatement) (new SimpleStatement(cql).setConsistencyLevel(ConsistencyLevel.TWO));
PreparedStatement preparedStatement = cassandraSession.prepare(toPrepare);
...
BoundStatement bounded = preparedStatement.bind(new BigInteger(myIntId), myStringId, new BigDecimal("42.0"));
cassandraSession.execute(bounded);
The Error
I then kill a single node and my client starts barfing with:
Not enough replica available for query at consistency TWO (2 required but only 1 alive)
which baffled me initially because there were still 4 nodes in the cluster:
dse$ bin/nodetool -h 192.168.1.15 status my_keyspace
Datacenter: Analytics
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.1.15 83.98 KB 256 40.0% 86df3d65-0e19-49b2-a49e-bc89d6347fa4 rack1
UN 192.168.1.16 82.78 KB 256 42.0% 8fghe485-0e19-49b2-a49e-bc89d6347fa5 rack1
UN 192.168.1.17 84.91 KB 256 38.0% 9834edf1-0e19-49b2-a49e-bc89d6347fa6 rack1
UN 192.168.1.18 88.22 KB 256 41.0% 873fdeab-0e19-49b2-a49e-bc89d6347fa7 rack1
DN 192.168.1.19 86.36 KB 256 39.0% 2bdf9211-0e19-49b2-a49e-bc89d6347fa8 rack1
Of course, the problem is that your tolerance to node failure is:

Tolerance = Replication Factor - Consistency Level

For me, that was 2 - 2 = 0: no tolerance at all. Oops.
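The arithmetic is trivial, but it's worth playing with the numbers. A minimal sketch (plain Java, no driver involved; the class and method names are my own):

```java
public class FaultTolerance {
    // How many replica nodes can be down while writes still succeed.
    // From the formula above: tolerance = replication factor - consistency level.
    static int tolerance(int replicationFactor, int consistencyLevel) {
        return replicationFactor - consistencyLevel;
    }

    public static void main(String[] args) {
        // My setup: RF = 2, CL = TWO -> zero tolerance.
        System.out.println(tolerance(2, 2)); // prints 0
        // RF = 3 with a CL needing 2 acks (e.g. QUORUM of 3) tolerates one dead replica.
        System.out.println(tolerance(3, 2)); // prints 1
    }
}
```

So with RF = 3 and QUORUM I could have lost a node and kept writing; with RF = 2 and TWO, any single dead replica is fatal for keys it owns.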
This does not affect all inserts, though. Running

dse$ bin/cqlsh -uUSER -pPASSWORD
Connected to PhillsCassandraCluster at 192.168.1.15:9160.
[cqlsh 4.1.1 | Cassandra 2.0.11.83 | DSE 4.6.0 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
and inserting worked fine. But that's because cqlsh uses a different default consistency level:
cqlsh> consistency
Current consistency level is ONE.
so, let's change that:

cqlsh> consistency TWO
Consistency level set to TWO.

With consistency set to TWO, inserts from cqlsh hit exactly the same error as my client.
The Problem
Cassandra will try to put each piece of data on REPLICATION_FACTOR nodes. Those nodes are determined by hashing the data's partition key. If one of those replica nodes is down, Cassandra will not choose another in its place; that replica simply cannot be written to at this time (of course, it might get the data if/when the node rejoins the cluster).
Upon inserting, the client decides how many replicas must acknowledge the write before the INSERT is declared a success. This number is given by your CONSISTENCY_LEVEL. If the contract cannot be fulfilled, Cassandra will barf.
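To make the failure mode concrete, here's a toy model of the coordinator's check. This is not real Cassandra internals, just my own sketch of the logic: the write is rejected up front if too few replicas are alive to ever meet the consistency contract.

```java
import java.util.List;

public class CoordinatorSketch {
    // My own stand-in for the driver's "Not enough replica available" error.
    static class NotEnoughReplicasException extends RuntimeException {
        NotEnoughReplicasException(int required, int alive) {
            super("Not enough replica available for query at consistency "
                  + required + " (" + required + " required but only " + alive + " alive)");
        }
    }

    // Fail fast before writing anything if the consistency contract
    // cannot possibly be met by the live replicas for this key.
    static void checkWrite(List<Boolean> replicaAlive, int consistencyLevel) {
        long alive = replicaAlive.stream().filter(up -> up).count();
        if (alive < consistencyLevel) {
            throw new NotEnoughReplicasException(consistencyLevel, (int) alive);
        }
        // ...otherwise, send the mutation to the live replicas and wait
        // for consistencyLevel acknowledgements.
    }

    public static void main(String[] args) {
        // My situation: RF = 2, one replica down, CL = TWO -> barf.
        try {
            checkWrite(List.of(true, false), 2);
        } catch (NotEnoughReplicasException e) {
            System.out.println(e.getMessage());
        }
        // The same write at CL = ONE is accepted: one live replica is enough.
        checkWrite(List.of(true, false), 1);
        System.out.println("CL ONE write accepted");
    }
}
```

Note that the other three healthy nodes in the cluster are irrelevant here: only the replicas chosen by the partition key hash count towards the consistency level.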
Further Reading
Application Failure Scenarios with Cassandra
Lightweight Transactions in Cassandra 2.0