Jordan Halterman ✌️

@definekuujo

Followers: 392
Following: 2K
Media: 83
Statuses: 3K

Senior Staff Engineer/Tech Lead - Cloud & AI software @intel. Formerly @ONF_SDN, @OracleDataSci. OSS. Distributed systems. AI. Cloud computing. Formal methods.

California
Joined July 2012
@heidiann360
Heidi Howard
2 years
Transactions that are fast, simple and fault tolerant! 😲 Exciting work by @ChrisJe34211511
@akatsarakis
Antonis Katsarakis
2 years
Wait… what!? Fault tolerant 2PC that is simple and commits in 1RTT? 🤯 If you missed @ChrisJe34211511's fantastic talk at #eurosys24 (#papoc24), definitely check the paper https://t.co/KTzt87dMjS
1
15
76
@DominikTornow
Dominik Tornow
2 years
@definekuujo When I first saw @TigerBeetleDB's Simulator and how every possible failure is enumerated and subsequently tested for, I was instantly hooked. Model checking the actual system 🏴‍☠️ https://t.co/ldzIzWULBJ
1
6
16
@definekuujo
Jordan Halterman ✌️
2 years
Jordan’s rules for building distributed systems:
#1 - don’t
#2 - if you have to anyways, at least don’t write the algorithms yourself
#3 - if you have to anyways, at least don’t do it without formal verification
#4 - if you have to anyways, go back to #1
1
0
12
@indygupta
Indranil Gupta
5 years
Types of Distributed Systems Papers. Joke modeled after @xkcd's https://t.co/XbBOojmBjt #distributedsystems #distributedsystemsjokes
0
68
250
@definekuujo
Jordan Halterman ✌️
2 years
It seems sharing it was worthwhile! I think SDN-based approaches to consensus are really promising, and with some time and effort we could see similar techniques brought to real world systems. I wish I had the opportunity to work on consensus algorithms more often. I miss it 😌
@definekuujo
Jordan Halterman ✌️
2 years
I was cleaning out my GitHub repos the other day and came across this old gem I’d totally forgotten about. When I was at the Open Networking Foundation, I did some work researching low latency consensus using SDN-enabled clock synchronization protocols. https://t.co/6Md3vG4dyC
0
0
1
@definekuujo
Jordan Halterman ✌️
2 years
Ugh that’s what I get for typing this on my phone
0
0
0
@definekuujo
Jordan Halterman ✌️
2 years
I’m throwing this one over the wall… enjoy!
0
0
0
@definekuujo
Jordan Halterman ✌️
2 years
I’d certainly expect its performance to degrade significantly under high load, at least (though hopefully no more than a traditional consensus algorithm’s). It’s clearly not ready for the real world. But I thought it would be interesting to share nonetheless… for posterity.
1
0
0
@definekuujo
Jordan Halterman ✌️
2 years
Remember, this protocol is experimental! It’s just an idea. It’s been literally years since I looked at it, and IIRC there was still at least one glaring performance issue with it when I last left it.
1
0
0
@definekuujo
Jordan Halterman ✌️
2 years
When a message arrives out of order, the order is reconciled by comparing the logs with a strong leader. This means coordination only occurs when timestamps decrease, and no coordination is required so long as they increase.
1
0
0
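The fast path described above can be sketched roughly as follows (a minimal illustration, not the actual repo code; all names here are hypothetical): a replica applies requests directly while their synchronized-clock timestamps keep increasing, and only defers to the leader when a timestamp decreases.

```go
// Hypothetical sketch of the JIT Paxos fast path: apply requests while
// timestamps increase; reconcile with the leader when one decreases.
package main

import "fmt"

// Request is a client request stamped with a synchronized wall-clock time.
type Request struct {
	Timestamp int64 // nanoseconds from the SDN-synchronized clock
	Command   string
}

// Replica tracks the highest timestamp it has applied so far.
type Replica struct {
	lastApplied int64
	log         []Request
}

// Receive applies a request on the fast path if its timestamp increases,
// and otherwise falls back to leader-driven log reconciliation.
func (r *Replica) Receive(req Request) {
	if req.Timestamp > r.lastApplied {
		// Fast path: timestamps are increasing, no coordination needed.
		r.log = append(r.log, req)
		r.lastApplied = req.Timestamp
		return
	}
	// Slow path: an out-of-order message arrived.
	r.reconcileWithLeader(req)
}

func (r *Replica) reconcileWithLeader(req Request) {
	// Placeholder: per the tweet above, the leader compares logs and
	// dictates the authoritative order for the conflicting requests.
	fmt.Println("timestamp decreased; deferring to leader for ordering")
}

func main() {
	r := &Replica{}
	r.Receive(Request{Timestamp: 100, Command: "set x=1"})
	r.Receive(Request{Timestamp: 90, Command: "set y=2"}) // out of order
}
```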
@definekuujo
Jordan Halterman ✌️
2 years
The JIT Paxos protocol itself is largely derived from Viewstamped Replication (leader, views, etc.). Requests are sent by the client to all replicas, and consensus is achieved in a single round trip as long as messages arrive in wall clock order.
2
0
0
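The client side of that single round trip might look roughly like this (an illustrative sketch under the assumptions in the tweet, not code from the repo): the client broadcasts a timestamped request to every replica and treats it as committed once a majority acknowledge it at the same log position.

```go
// Hypothetical sketch of the client-side commit check: consensus in one
// round trip once a quorum of replicas agree on the request's position.
package main

import "fmt"

// Ack is a replica's response, carrying the log position it assigned.
type Ack struct {
	Replica  int
	Position int
}

// committed reports whether a majority of n replicas agree on a position.
func committed(acks []Ack, n int) bool {
	counts := map[int]int{}
	for _, a := range acks {
		counts[a.Position]++
		if counts[a.Position] > n/2 {
			return true // quorum agrees: committed in one round trip
		}
	}
	return false
}

func main() {
	// All three replicas ordered the request at position 7 on the fast path.
	acks := []Ack{{0, 7}, {1, 7}, {2, 7}}
	fmt.Println(committed(acks, 3)) // true
}
```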
@definekuujo
Jordan Halterman ✌️
2 years
I thought the authors of the NOPaxos paper were onto something in using SDN to optimize consensus, so I started to wonder whether the same approach could be applied by essentially replacing the synchronizer (a single point of failure) with a clock synchronization protocol (which isn’t one).
1
0
0
@definekuujo
Jordan Halterman ✌️
2 years
When we were contacted by a team from (ugh I don’t remember which university, sorry!) who had used SDN to create a clock synchronization protocol accurate to within nanoseconds, we started asking how it could be used to optimize distributed systems (aside from the obvious).
1
0
0
@definekuujo
Jordan Halterman ✌️
2 years
The seed was planted in my mind after reading the NOPaxos paper, which used SDN to sequence packets before they’re replicated, enabling consensus in a single round trip in the normal case, and falling back to a more traditional consensus protocol when packets were dropped.
1
0
1
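The normal-case flow described in the tweet above can be sketched like this (a rough illustration of the idea, not the paper's implementation; names are made up): an in-network sequencer stamps each request with a consecutive sequence number, so replicas detect drops and reordering as gaps and only coordinate in those cases.

```go
// Rough sketch of the NOPaxos normal case: an in-network sequencer orders
// packets; replicas apply in-order messages immediately and treat gaps as
// drops that trigger the slower coordination path.
package main

import "fmt"

// Sequencer assigns monotonically increasing sequence numbers.
type Sequencer struct{ next uint64 }

func (s *Sequencer) Stamp(payload string) (uint64, string) {
	s.next++
	return s.next, payload
}

// Replica tracks the last sequence number it applied.
type Replica struct{ expected uint64 }

// Deliver applies requests in sequence order; a gap means a message was
// dropped or reordered.
func (r *Replica) Deliver(seq uint64, payload string) {
	if seq == r.expected+1 {
		r.expected = seq // normal case: in order, apply immediately
		fmt.Println("applied", seq, payload)
		return
	}
	fmt.Println("gap detected at", seq, "- falling back to coordination")
}

func main() {
	seq := &Sequencer{}
	r := &Replica{}
	s1, p1 := seq.Stamp("a")
	_, _ = seq.Stamp("b") // pretend this packet is dropped in the network
	s3, p3 := seq.Stamp("c")
	r.Deliver(s1, p1) // applied
	r.Deliver(s3, p3) // gap: sequence 2 never arrived
}
```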
@definekuujo
Jordan Halterman ✌️
2 years
I was cleaning out my GitHub repos the other day and came across this old gem I’d totally forgotten about. When I was at the Open Networking Foundation, I did some work researching low latency consensus using SDN-enabled clock synchronization protocols. https://t.co/6Md3vG4dyC
github.com
Experimental Paxos variant focused on low-latency consensus, derived from NOPaxos and Viewstamped Replication, using clock synchronization to avoid unnecessary round trips - kuujo/just-in-time-paxos
1
8
49
@definekuujo
Jordan Halterman ✌️
2 years
Of course, in this case the leader has committed an entry from its term, but entry 9 has not been replicated to a majority of nodes, so it can be lost regardless.
0
0
0
@definekuujo
Jordan Halterman ✌️
2 years
Until a leader commits an entry from its own term, entries from earlier terms can be overwritten even if they’re stored on a majority of replicas. See figure 8 in the original Raft paper for a deep dive into how and why that can happen: https://t.co/eAel98XVx0 3/
2
0
2
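The commit rule this thread describes can be sketched as follows (illustrative Go with made-up names, not any particular implementation): a leader only advances its commit index past an entry stored on a majority when that entry is from the leader's own term, which is exactly what rules out the figure 8 scenario.

```go
// Minimal sketch of the Raft commit rule: an old-term entry is never
// committed directly, even if it is stored on a majority of nodes.
package main

import "fmt"

type Entry struct{ Term int }

type Leader struct {
	currentTerm int
	log         []Entry // log[i-1] holds the entry at index i
	commitIndex int
	matchIndex  []int // highest index known replicated on each follower
}

// maybeAdvanceCommit finds the highest index N > commitIndex stored on a
// majority, but only commits it when log[N].Term == currentTerm.
func (l *Leader) maybeAdvanceCommit(clusterSize int) {
	for n := len(l.log); n > l.commitIndex; n-- {
		replicas := 1 // the leader itself stores the entry
		for _, m := range l.matchIndex {
			if m >= n {
				replicas++
			}
		}
		if replicas > clusterSize/2 && l.log[n-1].Term == l.currentTerm {
			l.commitIndex = n // committing N also commits all prior entries
			return
		}
	}
}

func main() {
	// Entry 2 is from an older term: even though it is on a majority, the
	// leader of term 3 must not commit it until a term-3 entry reaches a
	// majority as well.
	l := &Leader{currentTerm: 3, log: []Entry{{1}, {2}}, matchIndex: []int{2, 0}}
	l.maybeAdvanceCommit(3)
	fmt.Println(l.commitIndex) // still 0
}
```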
@definekuujo
Jordan Halterman ✌️
2 years
Note that I did *not* say an entry is guaranteed to be retained once it’s stored on a majority of nodes. That guarantee only holds once the leader has committed an entry in its term. This is such an essential part of ensuring safety in the Raft protocol. 2/
1
0
0
@definekuujo
Jordan Halterman ✌️
2 years
Yes. If a new leader is elected for any reason, that leader can overwrite the entry with an entry from its term. An entry is not guaranteed to be retained until a leader commits it (commit index >= entry index). 1/
@DominikTornow
Dominik Tornow
2 years
Raft Consensus Challenge II ⛓️ We have a Raft cluster with 3 nodes, each maintaining a replica of the log. Node 2 is the leader of term 3, accepted client request 9, & added 9 to its log. Can 9 be lost? Why? https://t.co/Yqfhp0Az1N
1
1
4
@definekuujo
Jordan Halterman ✌️
2 years
It’s also worth noting that while an entry can be appended to followers before the leader, the leader can still only commit that entry (increment the commit index) after it’s stored in its own log. A leader can’t just replicate an entry to a quorum of followers and tell them to commit.
0
0
0
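That constraint can be made concrete with a small sketch (hypothetical names, a simplification rather than a full implementation): because the leader counts its own log when computing the quorum, the commit index can never pass what the leader itself has stored.

```go
// Sketch of the invariant in the tweet above: the quorum index is bounded
// by the leader's own log, so a leader can never direct followers to
// commit an entry it has not stored itself.
package main

import "fmt"

// quorumIndex returns the highest index stored on a majority, counting the
// leader's own copy; it never exceeds leaderLogLen.
func quorumIndex(leaderLogLen int, matchIndex []int, clusterSize int) int {
	for n := leaderLogLen; n > 0; n-- {
		replicas := 1 // the leader's own copy
		for _, m := range matchIndex {
			if m >= n {
				replicas++
			}
		}
		if replicas > clusterSize/2 {
			return n
		}
	}
	return 0
}

func main() {
	// Followers hold entries through index 5, but the leader only stores 3:
	// the commit index is capped at 3, the leader's own log length.
	fmt.Println(quorumIndex(3, []int{5, 5}, 3)) // 3
}
```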