Paxos derived

no_identd · on Jan 29, 2018

Wow, that's amazing. On the topic of fault tolerance and consensus, here's a short but well done article on it:

https://ug93tad.github.io/consensus/

And on the topic of Paxos, some recent HN discussion:

https://news.ycombinator.com/item?id=16003662 - WPaxos: a wide area network Paxos protocol

https://news.ycombinator.com/item?id=13923949 - Paxos in 25 Lines

https://news.ycombinator.com/item?id=13950493 - Gryadka is not Paxos, so it's probably wrong [RETRACTED]

rystsov · on Jan 29, 2018

An update on Gryadka: Tobias Schottdorf and Greg Rogers independently explored it with TLA+ and didn't find any issues: https://tschottdorf.github.io/single-decree-paxos-tla-compar... and https://medium.com/@grogepodge/tla-specification-for-gryadka...

jbellis · on Jan 29, 2018

Murat's blog is underappreciated. One of the most approachable writers on distributed systems. Check out his full archives.

pkolaczk · on Jan 30, 2018

"In sum, something "fundamental" changes when you want to go fault-tolerant and tolerate node failure in an asynchronous system. When you combine faults and full-asynchrony, you get the FLP impossibility result. That means you lose progress! That is why Paxos does not guarantee making progress under a full asynchronous model with a crash failure."

This is unclear to me. Egalitarian Paxos guarantees progress under a full asynchronous model and doesn't have the dueling leaders problem. So this looks like a weakness of standard Paxos itself, not a fundamental problem.

matthelb · on Jan 30, 2018

Though it might not be explicitly stated in the paper, EPaxos has the same liveness guarantee as all other consensus protocols: commands will eventually commit if a long enough period of synchrony occurs. As the author of this post notes, this is a fundamental limitation of the specification of the consensus problem - no protocol can get around the limitation while solving consensus by definition.
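To make the "dueling leaders" point concrete, here is a rough Python sketch of the adversarial schedule for plain single-decree Paxos. Everything in it (the `Acceptor` class, the `duel` function, the proposer names "A" and "B") is made up for illustration and is not taken from the article or from the EPaxos paper. It also skips the rule where a proposer adopts a previously accepted value, since no accept ever succeeds while the duel lasts.

```python
class Acceptor:
    """Hypothetical in-memory acceptor; real ones persist this state."""

    def __init__(self):
        self.promised = 0      # highest ballot this acceptor has promised
        self.accepted = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        # Phase 1: promise to ignore anything below `ballot`.
        if ballot > self.promised:
            self.promised = ballot
            return True
        return False

    def accept(self, ballot, value):
        # Phase 2: accept unless a higher ballot has been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def duel(interruptions, n_acceptors=3):
    """Two proposers alternate; each one's prepare lands just before the
    other's accept. Returns (value chosen during the duel, value chosen
    once the last proposer is allowed to finish undisturbed)."""
    acceptors = [Acceptor() for _ in range(n_acceptors)]
    majority = n_acceptors // 2 + 1
    pending = None          # (ballot, value) of the proposer awaiting phase 2
    chosen_during = None

    for i in range(interruptions):
        name = "A" if i % 2 == 0 else "B"
        ballot = i + 1
        # The new proposer's prepare reaches every acceptor first ...
        for a in acceptors:
            a.prepare(ballot)
        # ... so the rival's accept, carrying an older ballot, is rejected.
        if pending is not None:
            acks = sum(a.accept(*pending) for a in acceptors)
            if acks >= majority:
                chosen_during = pending[1]
                break
        pending = (ballot, name)

    # "A long enough period of synchrony": nobody interrupts the last proposer.
    chosen_after = None
    if chosen_during is None and pending is not None:
        acks = sum(a.accept(*pending) for a in acceptors)
        if acks >= majority:
            chosen_after = pending[1]
    return chosen_during, chosen_after


result = duel(50)
```

Under this schedule nothing is chosen for as long as the interleaving continues, however many rounds you run, and a value is chosen as soon as one proposer gets an uninterrupted turn. Safety is never at risk; only progress is.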

Cofike · on Jan 29, 2018

I took Murat's distributed systems course at UB. Awesome professor, and I really enjoyed his lectures.

CurtMonash · on Jan 29, 2018

Similarly, Max Zorn used to ask people whether they recalled which result Zorn's Lemma was originally introduced as a lemma for. (I haven't a clue, and I doubt most of them did either.)

kmill · on Jan 29, 2018

Page 82 of https://www.sciencedirect.com/science/article/pii/0315086078... has some information.

Zorn introduced a "maximal principle" as an axiom. It appears Tukey called a generalization of it Zorn's Lemma, for unknown reasons. There is some version of the statement that really is proved, but apparently by Chevalley (a fact Zorn knew).

Socketopp · on Jan 29, 2018

I have no idea what this is all about. Anyone care to give a simple explanation?

heavenlyblue · on Jan 30, 2018

One of the most common problems in modern IT systems is keeping the state of your application consistent when it is distributed across a number of machines.

Paxos is one of the algorithms that solve this problem. It gives you a protocol that lets a set of machines agree on a sequence of operations to apply to their state, so that all the machines end up in the same state.

A simple example: suppose you had a set of 3 machines, each starting with a state of “0”, and wanted to add “1” to their state. Paxos defines how they should communicate so that in the end, even if one of the machines failed during execution, they would all end up with a state of “1”.
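For the curious, the 3-machine example can be sketched in a few lines of Python. This is a toy, single-decree version under heavy assumptions: everything runs in one process, "messages" are plain method calls, a crash is just a flag, and all the names (`Acceptor`, `propose`, `alive`) are invented for illustration. It is meant to show the shape of the two phases, not a production design.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                          # highest ballot promised so far
    accepted: Optional[Tuple[int, int]] = None  # (ballot, value) last accepted
    alive: bool = True                          # False simulates a crashed machine

    def prepare(self, ballot: int):
        # Phase 1: promise not to accept lower ballots and report
        # anything already accepted. A crashed machine never answers.
        if not self.alive or ballot <= self.promised:
            return None
        self.promised = ballot
        return ("promise", self.accepted)

    def accept(self, ballot: int, value: int) -> bool:
        # Phase 2: accept unless a higher ballot has been promised.
        if not self.alive or ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True


def propose(acceptors: List[Acceptor], ballot: int, value: int) -> Optional[int]:
    """Try to get `value` chosen; returns the chosen value or None."""
    majority = len(acceptors) // 2 + 1

    replies = [a.prepare(ballot) for a in acceptors]
    promises = [r for r in replies if r is not None]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, the proposer must adopt
    # the one with the highest ballot instead of its own.
    prior = [acc for _, acc in promises if acc is not None]
    if prior:
        value = max(prior, key=lambda bv: bv[0])[1]

    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


machines = [Acceptor(), Acceptor(), Acceptor()]
machines[2].alive = False                            # one machine fails mid-run
chosen = propose(machines, ballot=1, value=1)        # the operation "add 1"

machines[2].alive = True                             # it comes back later
later = propose(machines, ballot=2, value=5)         # a competing "add 5"

state = 0 + chosen                                   # every machine applies the same op
```

The first proposal succeeds with only two of the three machines responding. The second proposer wanted a different operation, but it learns that "add 1" was already chosen and re-proposes that instead, which is how the recovered machine catches up to the same state as the others.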

canadianwriter · on Jan 29, 2018

Blogspot? Now there's a TLD I haven't seen in years...

dragonwriter · on Jan 29, 2018

And you haven't seen it as a TLD now, either.

(The TLD is “com”, which you probably see fairly often in that role.)

xchaotic · on Jan 29, 2018

Are we there yet? Do we need paxos-like consensus protocols? Hardware is becoming cheaper and commoditised and with all the hype around blockchain, it looks like people are ready to pay extra for the redundant hardware needed for 100% fault tolerance. Still, it feels to me that in almost all cases, including financial transactions, it's good enough to be right 99.999% of the time and just amortise the costs of the very rare bit flip...

matthelb · on Jan 30, 2018

You bring up a good point, although I'm not sure if I agree with (or understand) the premises. I can't imagine a world where hardware and the protocols running on top will be immune to physical sources of faults, such as natural disasters, human intervention, or cosmic radiation(!). As a result, dealing with failures will always be a consideration in building distributed software systems, and Paxos or protocols that solve the same problem as Paxos will always be relevant.

I do think you make a good point that we don't always need Paxos-like protocols. Paxos is a very strong tool, so it solves a difficult problem, but is heavy-handed in many scenarios. There is a lot of space to explore lighter-weight alternatives to Paxos while still providing similarly strong properties.

fnord123 · on Jan 29, 2018

>Do we need paxos-like consensus protocols?

...yes.