Jordan Halterman ✌️
@definekuujo
392 Followers · 2K Following · 83 Media · 3K Statuses
Senior Staff Engineer/Tech Lead - Cloud & AI software @intel. Formerly @ONF_SDN, @OracleDataSci. OSS. Distributed systems. AI. Cloud computing. Formal methods.
California
Joined July 2012
Transactions that are fast, simple, and fault tolerant! 😲 Exciting work by @ChrisJe34211511
Wait… what!? Fault tolerant 2PC that is simple and commits in 1RTT? 🤯 If you missed @ChrisJe34211511’s fantastic talk at #eurosys24 (#papoc24), definitely check out the paper https://t.co/KTzt87dMjS
@definekuujo When I first saw @TigerBeetleDB’s Simulator and how every possible failure is enumerated and subsequently tested for, I was instantly hooked. Model checking the actual system 🏴‍☠️ https://t.co/ldzIzWULBJ
Types of Distributed Systems Papers. Joke modeled after @xkcd’s https://t.co/XbBOojmBjt
#distributedsystems #distributedsystemsjokes
It seems sharing it was worthwhile! I think SDN-based approaches to consensus are really promising, and with some time and effort we could see similar techniques brought to real world systems. I wish I had the opportunity to work on consensus algorithms more often. I miss it 😌
I was cleaning out my GitHub repos the other day and came across this old gem I’d totally forgotten about. When I was at the Open Networking Foundation, I did some work researching low latency consensus using SDN-enabled clock synchronization protocols. https://t.co/6Md3vG4dyC
I’d certainly expect its performance to degrade significantly under high load, at least (although hopefully no more than a traditional consensus algorithm’s). It’s clearly not ready for the real world. But I thought it would be interesting to share nonetheless… for posterity.
Remember, this protocol is experimental! It’s just an idea. It’s been literally years since I looked at it, and IIRC there was still at least one glaring performance issue with it when I left it.
When a message arrives out of order, the order is reconciled by comparing logs against a strong leader’s. This means coordination occurs only when timestamps decrease; no coordination is required so long as they increase.
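A minimal sketch of that fast/slow path (the names Request, Replica, and on_request are mine, not from the repo): a replica appends directly while timestamps keep increasing, and falls back to leader-led reconciliation the moment a timestamp decreases.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    timestamp: float  # synchronized wall-clock timestamp assigned by the client
    command: str

@dataclass
class Replica:
    log: list = field(default_factory=list)
    last_ts: float = float("-inf")

    def on_request(self, req: Request) -> str:
        if req.timestamp > self.last_ts:
            # Fast path: timestamps are still increasing, so this request
            # can be appended with no coordination at all.
            self.log.append(req)
            self.last_ts = req.timestamp
            return "ACK"
        # Slow path: the timestamp decreased (the message arrived out of
        # order), so this replica must reconcile its log with the leader.
        return "RECONCILE"
```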
The JIT Paxos protocol itself is largely derived from Viewstamped Replication (leader, views, etc.). Requests are sent by the client to all replicas, and consensus is achieved in a single round trip as long as messages arrive in wall-clock order.
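A hedged sketch of that normal-case flow, reusing the Request and Replica types from the sketch above: the client timestamps a request, broadcasts it to every replica, and treats it as committed once a majority acknowledge on the fast path. A real protocol would likely require matching responses or a supermajority; this only shows the shape of the single round trip.

```python
import time

def submit(replicas, command):
    # The client stamps the request with its synchronized clock and sends
    # it to every replica (sequentially here, in parallel in practice).
    req = Request(timestamp=time.time(), command=command)
    acks = [r.on_request(req) for r in replicas]
    if sum(a == "ACK" for a in acks) > len(replicas) // 2:
        return "COMMITTED"  # fast-path acks from a majority: one round trip
    return "FALL-BACK-TO-LEADER"  # some replica saw the request out of order
```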
I thought the authors of the NOPaxos paper were onto something in using SDN to optimize consensus, so I started to wonder whether the same approach could be applied by essentially replacing the sequencer (a single point of failure) with a clock synchronization protocol (which is not one).
When we were contacted by a team from (ugh, I don’t remember which university, sorry!) who had used SDN to create a clock synchronization protocol accurate to within nanoseconds, we started asking how it could be used to optimize distributed systems (aside from the obvious).
The seed was planted in my mind after reading the NOPaxos paper, which used SDN to sequence packets before they’re replicated, enabling consensus in a single round trip in the normal case, and falling back to a more traditional consensus protocol when packets were dropped.
I was cleaning out my GitHub repos the other day and came across this old gem I’d totally forgotten about. When I was at the Open Networking Foundation, I did some work researching low latency consensus using SDN-enabled clock synchronization protocols. https://t.co/6Md3vG4dyC
github.com
Experimental Paxos variant focused on low-latency consensus, derived from NOPaxos and Viewstamped Replication, using clock synchronization to avoid unnecessary round trips - kuujo/just-in-time-paxos
Of course, in this case the leader has committed an entry from its term, but entry 9 has not been replicated to a majority of nodes, so it can be lost regardless.
Until a leader has committed an entry from its current term, earlier entries in its log can be overwritten even if they’re stored on a majority of replicas. See figure 8 in the original Raft paper for a deep dive into how and why that can happen: https://t.co/eAel98XVx0 3/
Note that I did *not* say an entry is guaranteed to be retained once it’s stored on a majority of nodes. That guarantee only holds once the leader has committed an entry in its term. This is such an essential part of ensuring safety in the Raft protocol. 2/
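A minimal sketch of that commitment rule, using typical Raft state names (log, commitIndex, matchIndex) rather than any particular implementation: the leader only advances the commit index to an entry that is both stored on a majority and from its own current term.

```python
def advance_commit_index(log, current_term, commit_index, match_index, n_nodes):
    """log: list of (term, command), 1-indexed by position; match_index:
    highest index known to be stored on each follower."""
    for n in range(len(log), commit_index, -1):  # try higher indexes first
        # The leader counts itself (everything in log is stored locally)
        # plus each follower whose matchIndex has reached n.
        stored_on = 1 + sum(1 for m in match_index.values() if m >= n)
        if stored_on > n_nodes // 2 and log[n - 1][0] == current_term:
            return n  # majority + current-term entry: safe to commit through n
    return commit_index  # no current-term entry is majority-stored yet
```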
Yes. If a new leader is elected for any reason, that leader can overwrite the entry with an entry from its term. An entry is not guaranteed to be retained until a leader commits it (commit index >= entry index). 1/
Raft Consensus Challenge II ⛓️ We have a Raft cluster with 3 nodes, each maintaining a replica of the log. Node 2 is the leader of term 3, accepted client request 9, & added 9 to its log. Can 9 be lost? Why? https://t.co/Yqfhp0Az1N
It’s also worth noting that while an entry can be appended to followers before the leader, the leader can still only commit that entry (increment the commit index) after it’s stored in its own log. A leader can’t just replicate an entry to a quorum of followers and tell them to commit.
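Continuing the sketch above with the challenge’s cluster (3 nodes, leader of term 3, entry 9; the logs here are hypothetical): even if both followers acknowledge entry 9 before the leader’s own local append completes, the commit index can’t reach 9 until the leader’s log contains the entry.

```python
leader_log = [(3, f"cmd-{i}") for i in range(1, 9)]    # entries 1..8, term 3
match_index = {"follower-1": 9, "follower-2": 9}       # followers already hold entry 9

# Entry 9 isn't in the leader's own log yet, so the commit index stays at 8:
assert advance_commit_index(leader_log, 3, 8, match_index, 3) == 8

leader_log.append((3, "cmd-9"))  # only once the leader stores entry 9 locally...
assert advance_commit_index(leader_log, 3, 8, match_index, 3) == 9  # ...can it commit
```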