A deep dive into how data is stored on Cassandra rings and how to use this knowledge to give your operators notice before cluster issues become site issues.
Any feedback is super welcome. If folks like this we may consider working on upstreaming the approach into nodetool.
This is awesome. We're rolling out a bunch of production apps using C* and having some difficulty monitoring all the right things without OpsCenter. We're getting there, but this looks like a shortcut to some stuff we put off as too difficult to do immediately.
To be honest, most of my exposure to RethinkDB is from Aphyr's article on it. We already run basically every alternative he mentions as superior for particular use cases. For example, we run ZooKeeper / replicated SQL for inter-key consistent actions and Cassandra for an AP store. When we need document semantics, we have a pretty robust Elasticsearch setup. It didn't seem like RethinkDB would do all of those things better than those special-purpose databases, so I didn't really look into it too much.
Replacing Cassandra at Yelp would be a lot of effort, so we'd have to be sure that it's worth it. That being said, RethinkDB definitely looks interesting and I'll make sure it's on my list of datastores to evaluate.
We've had a few internal discussions about ScyllaDB. I think that the performance numbers look attractive, but we are concerned about maintainability. In particular, we've invested fairly heavily in configuration management, tooling, monitoring, etc., and Apache Cassandra seems to work pretty well for us.
We might try it out someday, but for now we're fairly happy with stock Cassandra.