Alex Miller @AlexMillerDB X Profile

Alex Miller

@AlexMillerDB

Followers

3K

Following

403

Media

75

Statuses

942

Databases. See also @[email protected] or @alexmillerdb.bsky.app

Joined May 2014


Alex Miller

@AlexMillerDB

22 days

I can’t seem to think of blog posts I’ve seen over time that reflect all the things I had to learn by failing at them.

2

0

Alex Miller

@AlexMillerDB

22 days

I’m thinking of things like:
* Driving consensus between conflicting (busy) approvers
* Ensuring you get proper feedback from design docs.
* Doing good work is as important as performatively doing good work
* How to make the case for a new project.

1

0

3

Alex Miller

@AlexMillerDB

22 days

Does anyone have links to good writing on the sort of soft skills you learn from working in larger organizations about how to work in larger organizations as an IC? The overall space of soft skills dealing with the pretty common ways that large corporations behave.

2

0

7

Alex Miller

@AlexMillerDB

1 month

The recording from our last South Bay Systems meetup is now available! https://t.co/4Q7Q4d5bFo

0

8

43

Alex Miller

@AlexMillerDB

2 months

[PVLDB] Enhancing Transaction Processing through Indirection Skipping https://t.co/SiQB5ohFEJ Whereas VMCache improves on pointer swizzling's complexity by removing the swizzling, this work points out that page and frame hints are highly effective, and that it's okay if they're wrong.

2

6

43

Alex Miller

@AlexMillerDB

2 months

Then, we'll have "The Evolution of Semi-Structured Data Analytics" by Owen Xiao, Co-founder of VeloDB and PMC member of Apache Doris, where we'll hear about the difficulties of analytics on semi-structured data, and the approach that Apache Doris took to address them.

0

3

Alex Miller

@AlexMillerDB

2 months

Our first talk is "Low-Latency Serving on Cloud Object Stores with Apache Pinot" by Songqiao Su and Raghav Yadav, both Staff Engineers at StarTree, who will talk about storage tiering and Iceberg support under low-latency, real-time analytics requirements.

1

0

3

Alex Miller

@AlexMillerDB

2 months

South Bay Systems returns on October 27th at Adobe in downtown San Jose. We have an Analytics-on-Object-Storage double feature this time starring two different Apache projects: Apache Pinot and Apache Doris. (Talk descriptions below.) Register now!

luma.com

Welcome to another edition of South Bay Systems! This time, we'll have a double feature! First we'll have Songqiao Su and Raghav Yadav talking about…

1

3

18

Alex Miller

@AlexMillerDB

3 months

[ASPLOS'25] Fusion: An Analytics Object Store Optimized for Query Pushdown https://t.co/ntBQ3njtlw Tightly integrating an Iceberg catalog with an object store means that one could make file-format-aware erasure coding decisions, which permit pushing down filters and aggregations.

1

14

104

Alex Miller

@AlexMillerDB

3 months

[VLDB] Towards Principled, Practical Document Database Design https://t.co/EA819FKjLo If you've ever wished that there was a document database equivalent for relational databases' 3NF-style schema design guidance, then this is the paper for you.

0

8

56

Alex Miller

@AlexMillerDB

3 months

[arXiv] On the Theoretical Limitations of Embedding-Based Retrieval https://t.co/z5i3qCDnaq It's impossible to retrieve all combinations of pairs of documents post-embedding. Thus, there are use cases that vector search won't do well at. Conversely, BM25 excels in these cases.

1

28

Qian Li

@qianl_cs

3 months

WebAssembly is Cool! (finally!) - The South Bay Systems talk series is back on Oct 2nd. Jakob will talk about the history and novel use cases of WebAssembly. Perfect timing too: Wasm 3.0 just released, and it is turning 10 this year. From browsers to embedded systems and beyond,

1

4

16

Alex Miller

@AlexMillerDB

3 months

Thread on it with a better overview from the author. Great paper, @g_sehgal1997!

Gaurav Sehgal

@g_sehgal1997

5 months

🎉 Thrilled to announce that our paper "NaviX" on vector search has been accepted to one of the top systems conferences, @VLDBconf! 🚀 Happening in London 🎡 this September. 📝 : https://t.co/HzPyBMKurI 🧑‍💻 : https://t.co/AnyIYUxXnH 1/18

1

0

3

Alex Miller

@AlexMillerDB

3 months

[VLDB] NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance https://t.co/74gmNi9qMh It feels like a follow-on/improvement to ACORN. Also interesting to see HNSW built directly on a graph database working well.

1

2

20

Alex Miller

@AlexMillerDB

4 months

[arXiv] Theseus: A Distributed and Scalable GPU-Accelerated Query Processing Platform Optimized for Efficient Data Movement https://t.co/Gn3mgXFRAn Great to see the Voltron Data folk writing about their GPU database!

0

5

80

Alex Miller

@AlexMillerDB

4 months

Relatedly, there are PRNGs faster than Mersenne Twister that are of reasonable quality. The Xoshiro family seems well respected (https://t.co/Ri1TqPtD9K), and it vectorizes! https://t.co/UiUNANRWzp and https://t.co/lWSnHNFopp are also interesting reads.

0

1

3
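As a rough illustration of the Xoshiro family mentioned above, here is a minimal Python sketch of xoshiro256** following the publicly documented reference algorithm. The class name `Xoshiro256StarStar` and the use of SplitMix64 to expand a single integer seed into the four state words are choices made for this sketch, not something taken from the linked posts. Pure Python will not show the speed or vectorization benefits; it only shows how small the state update is.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


def _rotl(x: int, k: int) -> int:
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) & MASK64) | (x >> (64 - k))


def _splitmix64(state: int) -> tuple[int, int]:
    """One SplitMix64 step: returns (new_state, output). Used only for seeding."""
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)


class Xoshiro256StarStar:
    """Hypothetical wrapper name; the update rule is xoshiro256**."""

    def __init__(self, seed: int) -> None:
        # Assumption of this sketch: expand one seed into 256 bits of state
        # with SplitMix64 so the state is never all zeros in practice.
        sm = seed & MASK64
        self.s = []
        for _ in range(4):
            sm, out = _splitmix64(sm)
            self.s.append(out)

    def next_u64(self) -> int:
        s = self.s
        # The "**" scrambler: multiply, rotate, multiply on state word 1.
        result = (_rotl((s[1] * 5) & MASK64, 7) * 9) & MASK64
        t = (s[1] << 17) & MASK64
        # Linear state transition: xors, one shift, one rotate.
        s[2] ^= s[0]
        s[3] ^= s[1]
        s[1] ^= s[2]
        s[0] ^= s[3]
        s[2] ^= t
        s[3] = _rotl(s[3], 45)
        return result
```

Unlike a clock-derived generator, this one is fully deterministic for a given seed, which keeps tests reproducible.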

Alex Miller

@AlexMillerDB

4 months

To randomly sample a number of operations, one pulls from a PRNG. https://t.co/QBeHJYZh0U instead shows a cute trick for defining a stateless PRNG: pull RDTSC, run it through a quick hash to scramble the bits (e.g. rapidhash). Cache-miss-free, but you lose determinism in tests.

github.com

MySQL RP (Restore Performance) is modified version of MySQL Community, to restore performance equal to or better than previous major versions. - buildup-db/mysql-server-RP

2

0

24
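A minimal sketch of the stateless-PRNG idea described above, written in Python for readability. Two substitutions are assumptions of this sketch: `time.perf_counter_ns()` stands in for RDTSC (Python has no direct access to the instruction), and MurmurHash3's 64-bit finalizer stands in for rapidhash as the bit scrambler. The function names are hypothetical and are not taken from the linked repository.

```python
import time

MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


def mix64(x: int) -> int:
    """MurmurHash3 64-bit finalizer, used here in place of rapidhash."""
    x &= MASK64
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    x = (x * 0xC4CEB9FE1A85EC53) & MASK64
    x ^= x >> 33
    return x


def stateless_random_u64() -> int:
    """Scramble a fast-moving counter instead of reading PRNG state.

    Assumption: perf_counter_ns() plays the role of RDTSC. There is no
    generator state to load, so there is nothing to miss in cache.
    """
    return mix64(time.perf_counter_ns())


def should_sample(one_in: int) -> bool:
    """Return True for roughly one call in `one_in`."""
    return stateless_random_u64() % one_in == 0
```

Because the input is a clock reading, two runs never see the same sequence, which is the loss of test determinism the post points out.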

Qian Li

@qianl_cs

4 months

Had our biggest South Bay Systems meetup yet last night! Thanks to everyone who came and joined the vibrant discussion. Big thanks to @databricks for hosting! @andy_pavlo from CMU gave a deep dive into the 50-year history of database tuning, his work applying AI/ML to the

2

7

97

Alex Miller

@AlexMillerDB

5 months

Attention, South Bay folk! We have The Databaseologist, @andy_pavlo, giving a talk in the bay on August 6th. Come join us for a great time hearing "ChatGPT Ain’t Got $%@& On Me! The Future of Automated Database Tuning". Register now!

luma.com

We're excited to feature Andy Pavlo, illustrious database professor at CMU, to talk about database tuning. This meetup's venue, food and drinks, are generously…

1

9

53

Alex Miller

@AlexMillerDB

5 months

I had missed @ssougou's blog post series on consensus when it was originally posted. I really like the perspective of breaking down Raft/Paxos/etc. into the individual actions that comprise consensus. https://t.co/d80Q34RjX7

planetscale.com

This is a multi-part blog series and will be updated with links to the corresponding posts.

1

24

241