I'm putting together a 16-node Cassandra cluster (replication factor 2) and want to set up a schedule for nodetool repair. gc_grace_seconds is at the default.
Two questions:
- My first impulse is to set up a cron job for each machine and attempt to manually randomize the timing around a one-week schedule. Is there a better way?
- Does nodetool repair have to be run on every system, or only on (number of systems / replication factor) systems? (I.e., for my 16 nodes with replication factor 2: 8 systems, one of each pair.)
Best Answer
I would not randomize it. Your best bet is to schedule the repairs so they don't stomp on each other.
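You should use the -pr option on each node when running repair. With -pr, a node repairs only the token ranges it is the primary replica for, so the repair has to run on every node (not just one of each pair) to cover the whole ring.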
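To make "schedule them so they don't stomp on each other" concrete, here is a rough sketch that spreads one repair per node evenly across a week and prints a crontab line for each host. The host names (node01 to node16) and the helper name are made up for illustration. It also assumes each repair finishes inside its slot (10.5 hours for 16 nodes), which you would have to check against your own data volume. A weekly cycle stays inside the default gc_grace_seconds of 864000 seconds (10 days).

```python
# Sketch only: evenly staggered weekly repair slots, one per node.
# Host names and the function name are hypothetical.

WEEK_MINUTES = 7 * 24 * 60


def staggered_cron_entries(hosts, command="nodetool repair -pr"):
    """Return {host: crontab line}, with start times spaced evenly over a week."""
    slot = WEEK_MINUTES // len(hosts)  # minutes between consecutive starts
    entries = {}
    for i, host in enumerate(hosts):
        offset = i * slot
        day, rem = divmod(offset, 24 * 60)  # cron day-of-week, 0 = Sunday
        hour, minute = divmod(rem, 60)
        entries[host] = f"{minute} {hour} * * {day} {command}"
    return entries


if __name__ == "__main__":
    hosts = [f"node{i:02d}" for i in range(1, 17)]
    for host, line in staggered_cron_entries(hosts).items():
        print(f"{host}: {line}")
```

Each printed line goes into the crontab of the matching host. If a repair can overrun its slot, use fewer, longer slots or a longer cycle that still completes within gc_grace_seconds.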
If you're using Cassandra 2.1, you have the option of incremental repair, which will speed things up considerably.
RF=2 is also a recipe for disaster: a quorum of 2 replicas is 2, so quorum queries for a node's data will fail whenever that node is unavailable. I recommend RF=3.
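The arithmetic behind the RF recommendation, as a small sketch. Quorum is floor(RF / 2) + 1 replicas. The function names are just for illustration.

```python
# Sketch: how many replicas can be down before QUORUM requests fail.

def quorum(rf):
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """Replicas that can be unavailable while QUORUM still succeeds."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_failures(rf)} down")
```

With RF=2 both replicas must answer, so a single node outage breaks quorum for the ranges that node holds. With RF=3 the quorum is still 2, which leaves room for one replica to be down.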