Implementing Raft: Part 2: Commands and Log Replication

manigandham · on March 1, 2020

Go now has a nice set of polished libraries for distributed computing (hashicorp/raft, etc.) that make it easy to start new projects. I wish other languages had as many; I might have to start porting.

NewJazz · on March 1, 2020

https://raft.readthedocs.io/en/latest/

dnautics · on March 1, 2020

Raft is incredibly easy to write correctly, with tests, since basically everything you need is in Figure 2. I've done it once, but then lost all of the code because I got mugged, my computer was stolen, and I had forgotten to make a GitHub repo. I'm rebuilding it currently (the current one clocks in at about 700 SLOC, including tests, and it's about 3/4 done).

rubiquity · on March 1, 2020

Raft is not easy to implement. People think it is. But that’s only because they have implemented the happy paths, which are hard enough to get right as it is.

monadic2 · on March 1, 2020

Could you explain which parts you found difficult to implement?

gozzoo · on March 3, 2020

The alternative to the happy paths would be all the unexpected things that might happen. They would be easy to implement only if we could foresee them. I think the difficulty comes from our limited ability to predict all the situations where things may go wrong.

gozzoo · on March 3, 2020

A classic example of this is when in 2006 it was discovered that nearly all implementations of binary search and mergesort were broken - more than 50 years after these algorithms were invented, and after they had been implemented thousands of times by the brightest minds in computer science.

Joshua Bloch (of Java fame) blogged about this [1] and here is the main takeaway:

> The key lesson was to carefully consider the invariants in your programs.

[1] https://ai.googleblog.com/2006/06/extra-extra-read-all-about...
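For readers who haven't seen it, the bug in question is the midpoint calculation: adding two large indices can overflow a fixed-width integer. Here is a minimal Go sketch of the problem and the usual fix. The function names `midNaive` and `midSafe` are made up for illustration, and `int32` is used only to make the wraparound easy to trigger.

```go
package main

import "fmt"

// midNaive computes the midpoint the way many textbook binary searches did.
// With 32-bit ints, low+high can wrap around to a negative number.
func midNaive(low, high int32) int32 {
	return (low + high) / 2
}

// midSafe avoids the overflow by adding half the distance to low,
// so no intermediate value exceeds high.
func midSafe(low, high int32) int32 {
	return low + (high-low)/2
}

func main() {
	// Two indices near the top of the int32 range.
	low := int32(1 << 30)
	high := low + 2

	fmt.Println("naive:", midNaive(low, high)) // negative: the sum wrapped around
	fmt.Println("safe: ", midSafe(low, high))  // stays between low and high
}
```

The invariant worth stating explicitly is `low <= mid <= high`; the naive version silently violates it once the array is large enough.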

dnautics · on March 1, 2020

that's why you do property tests with unreliable mock networks.

frant-hartm · on March 1, 2020

Raft is easy to understand. It may be easier to write compared to e.g. Paxos, but that's it.

The consensus module and log are not very useful on their own. You need the integration with multiple concurrent clients and some state machine (even a key-value store will be complicated). This is still hard and doesn't get any easier with Raft.
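To make the state machine part concrete, here is a minimal sketch in Go of the simplest piece: applying committed log entries to a key-value map in order, using the `commitIndex` / `lastApplied` bookkeeping from the Raft paper. `Entry`, `KVStore`, and `ApplyCommitted` are hypothetical names, and everything that makes this hard in practice (client sessions, duplicate requests, snapshots, linearizable reads) is deliberately left out.

```go
package main

import "fmt"

// Entry is a hypothetical log entry carrying a "set key=value" command.
type Entry struct {
	Term       int
	Key, Value string
}

// KVStore is a toy state machine. lastApplied is the 1-based index of the
// last log entry applied; 0 means nothing has been applied yet.
type KVStore struct {
	data        map[string]string
	lastApplied int
}

func NewKVStore() *KVStore {
	return &KVStore{data: make(map[string]string)}
}

// ApplyCommitted applies entries lastApplied+1 through commitIndex, in order.
// log[0] holds the entry with index 1. Entries past commitIndex are never
// applied, because a new leader may still overwrite them.
func (s *KVStore) ApplyCommitted(log []Entry, commitIndex int) {
	if commitIndex > len(log) {
		commitIndex = len(log)
	}
	for s.lastApplied < commitIndex {
		e := log[s.lastApplied]
		s.data[e.Key] = e.Value
		s.lastApplied++
	}
}

func main() {
	log := []Entry{
		{Term: 1, Key: "x", Value: "1"},
		{Term: 1, Key: "y", Value: "2"},
		{Term: 2, Key: "x", Value: "3"}, // replicated but not yet committed
	}
	store := NewKVStore()
	store.ApplyCommitted(log, 2)
	fmt.Println(store.data["x"], store.data["y"], store.lastApplied)
}
```

Calling `ApplyCommitted` again with the same `commitIndex` is a no-op, which is what lets the apply loop run safely after every commit index update.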

manigandham · on March 1, 2020

It might be simple to implement the various operations, but it's not easy to get perfect and reliable. I'd rather have a production-ready library used by major projects than something I write myself.

hu3 · on March 1, 2020

I'm sorry your PC got stolen. Just wanted to say keep it up!

beders · on March 1, 2020

Wouldn't it be nice if this was actually part of the OS networking stack?

Wouldn't that make it easier to write correct (for some form of correctness) distributed applications, leaving the messy details to a proven lower-level stack?

antoinealb · on March 1, 2020

I wrote my master's thesis on putting Raft inside the RPC layer for exactly that reason. We arrived at the conclusion that this was indeed a very good way to easily provide distributed consensus to the application layer.

nujabe · on March 2, 2020

Interesting, do you mind sharing your thesis?

antoinealb · on March 2, 2020

Here it is: https://github.com/antoinealb/master-thesis/blob/master/thes...

Happy to discuss it in more detail as well :)

wpietri · on March 1, 2020

Ah, nice. My first introduction to systems with replayable command logs at their heart was Prevayler, and it really changed the way I think about system design for the better. I'm excited to try this out.