Jack Vanlightly @vanlightly X Profile

Jack Vanlightly

@vanlightly

Followers

4K

Following

1K

Media

104

Statuses

2K

@confluentinc thinking about event streaming. Ex @Splunk, @VMware https://t.co/3axXZezyy4, https://t.co/voJWmL4KM6 Credit: ESO/B. Tafreshi

Barcelona, Spain

Joined November 2016

Jack Vanlightly

@vanlightly

10 months

I've written 18 posts (and counting) on table format internals. I've created a page that contains the list of my writings on the subject, including my formal verification work. Any suggestions on further table format analysis?

jack-vanlightly.com

I’ve created this page to make it easier for me to share links about my writing on table format internals. Currently, it includes Apache Iceberg, Delta Lake, Apache Hudi, and Apache Paimon.

2

14

124

Jack Vanlightly

@vanlightly

2 days

New deep dive: Understanding Apache Fluss. I spent August reverse-engineering Fluss, Alibaba’s new table storage engine for Flink (partially forked from Kafka). This post covers its architecture, tiering, and how it tackles changelogs & low-latency state.

jack-vanlightly.com

This is a data system internals blog post. So if you enjoyed my table formats internals blog posts, or writing on Apache Kafka internals or Apache BookKeeper internals, you might enjoy this one...

2

17

133

Jack Vanlightly

@vanlightly

14 days

There's been some talk around storage unification and zero-copy lakehouse integrations recently, and I wanted to better define these terms as well as look at ways we should evaluate different design decisions in this space. So I’ve published a new post: A Conceptual Model for...

0

6

48

Jack Vanlightly

@vanlightly

1 month

In a future of autonomous AI agents, we can't limit ourselves to error prevention and error detection, we must also include remediation. But when AI loses touch with reality due to hallucinations, confabulation and misinterpretation, who does the remediation? In cases of...

0

2

Jack Vanlightly

@vanlightly

1 month

Science moves slowly because wrong theories waste decades. Engineering is careful because failures kill people. Software moves fast because mistakes are cheap; the expensive error isn't making the wrong choice, it's taking too long to make any choice.

jack-vanlightly.com

A recent LinkedIn post by Nick Lebesis caught my attention with this brutal take on the difference between good startup founders and coward startup founders. I recommend you read the entire thing to...

3

11

56

Jack Vanlightly

@vanlightly

2 months

A new case study is born.

Jason ✨👾SaaStr.Ai✨ Lemkin

@jasonlk

2 months

.@Replit goes rogue during a code freeze and shutdown and deletes our entire database

0

8

Jack Vanlightly

@vanlightly

2 months

In distributed systems, reliability isn’t just about retries and durability, it’s about knowing who owns recovery. My latest post, based on the Coordinated Progress model I posted previously, explores how reliable triggers create responsibility boundaries and how those boundaries...

0

15

101

Jack Vanlightly

@vanlightly

3 months

Over the past few months, I’ve been thinking deeply about how systems make progress reliably in the face of partial failures, service boundaries, retries, and complex dependencies. Building reliable workflows across microservices, functions, and stream processors is one of the...

3

16

61

Jack Vanlightly

@vanlightly

3 months

How to reliably distribute work across microservices, stream processors, durable execution, event-driven, orchestration and now AI agents? Coordinated Progress is a 4-part series that explores the common structure behind reliable distributed systems.

jack-vanlightly.com

At some point, we’ve all sat in an architecture meeting where someone asks, “Should this be an event? An RPC? A queue?”, or “How do we tie this process together across our microservices? Should it...

1

31

183

Jack Vanlightly

@vanlightly

5 months

Another Humans of the Data Sphere is out, with issue 10! In this issue people are talking fsyncs, tips for running ClickHouse at scale, the problems with MCP and more. Plus I dig up a classic paper from 1962.

hotds.dev

Your biweekly dose of insights, observations, commentary and opinions from interesting people from the world of databases, AI, streaming, distributed systems and the data engineering/analytics space.

0

2

7

Jack Vanlightly

@vanlightly

5 months

The specs are in this repo for anyone interested.

github.com

TLA+ specifications for Kafka related algorithms. Contribute to Vanlightly/kafka-tlaplus development by creating an account on GitHub.

0

5

Jack Vanlightly

@vanlightly

5 months

kafka.apache.org

Apache Kafka: A Distributed Streaming Platform.

0

3

Jack Vanlightly

@vanlightly

5 months

Proud to have contributed formal verification (TLA+) for three key improvements in Kafka 4.0: ✅ KIP-966: Strengthens the replication protocol. ✅ KIP-996: Introduces PreVote for more stable KRaft leadership. ✅ KIP-848: Delivers more efficient, predictable rebalancing.

4

20

115

Jack Vanlightly

@vanlightly

6 months

Seems like I’m not alone. For what it’s worth, I’ve got a great fit at Confluent — but the more senior I get, the more I wonder how sustainable that is across future PE roles. Thinking of writing a blog post, maybe with interviews or perspectives from PEs who aren’t natural cat...

Jack Vanlightly

@vanlightly

6 months

Any Principal Engineers out there with ADHD or creative wiring — who don’t thrive in the tasks of project coordination, alignment meetings, and people management, but thrive on strategy, system design, writing, and shaping direction through ideas? Curious how you navigate the...

0

1

30

Jack Vanlightly

@vanlightly

6 months

RT @ijuma: And the old group coordinator implementation is gone from Apache Kafka - love it when open-source projects can delete large chun…

github.com

This patch is the third of a series of patches to remove the old group coordinator. With the release of Apache Kafka 4.0, the so-called new group coordinator is the default and only option availabl...

0

3

0

Jack Vanlightly

@vanlightly

6 months

Any Principal Engineers out there with ADHD or creative wiring — who don’t thrive in the tasks of project coordination, alignment meetings, and people management, but thrive on strategy, system design, writing, and shaping direction through ideas? Curious how you navigate the...

19

8

181

Jack Vanlightly

@vanlightly

6 months

A new disaggregated log replication survey post is out. How does the combination of Apache Pulsar with Apache BookKeeper divide and conquer the responsibilities of log replication?

jack-vanlightly.com

In this latest post of the disaggregated log replication survey, we’re going to look at the Apache BookKeeper Replication Protocol and how it is used by Apache Pulsar to form topic partitions. Raft...

0

19

99

Jack Vanlightly

@vanlightly

6 months

Another Humans of the Data Sphere is out, with issue #9! In this issue, we also look at whether software engineers can learn from mechanical engineering, and at table formats as a form of virtualization.

hotds.dev

Your biweekly dose of insights, observations, commentary and opinions from interesting people from the world of databases, AI, streaming, distributed systems and the data engineering/analytics space.

0

2

8

Jack Vanlightly

@vanlightly

6 months

RT @ankushpd: If you are looking for formal models of a real-world distributed system, DeepSeek @deepseek_ai released P specifications for…

0

42

0

Jack Vanlightly

@vanlightly

6 months

A new log replication disaggregation survey post is out! The Kafka Replication Protocol: 🔹 Separation of control plane from data plane. 🔹 Role separation with minimal coupling. 🔹 Kafka’s alignment with Paxos roles.

jack-vanlightly.com

In this post, we’re going to look at the Kafka Replication Protocol and how it separates control plane and data plane responsibilities. It’s worth noting there are other systems that separate...

2

17

119