Spiral
@SpiralDB
Followers: 712 · Following: 123 · Media: 5 · Statuses: 50
The data warehouse for pre-training.
NYC & London
Joined March 2023
We're building the data infrastructure that AI actually needs. Current systems were built for humans reading dashboards. But an H100 can consume 4 million images per second. The future isn't human-scale. It's machine-scale. Introducing Spiral: Data 3.0 🌀 1/8
Super cool, they forked @DeltaLakeOSS to replace Parquet (for data) with Vortex and JSON (for metadata) with Vortex. Huge performance gains! Maybe we should upstream this one 😁 @vortexdotdev
🧊 New on the Polar Signals Blog — Our Delta Lake Fork Purpose-built for our continuous profiling product. In our latest post, we walk through how Delta Lake works, and the changes we've made to improve performance for our product. 👉 Read the full post: https://t.co/KqcJIe0VAq
1000x goes to NYC! A talk by @chaitybhandari, “TigerBeetle Consensus & TigerStyle”, a window into TigerSales from @peterahn, then Q&A and mix/mingle. Special thanks to @willmanning and @SpiralDB for having us at their brand new office! Dec 4, 5:30pm https://t.co/VUaE96yvQw
And... Paris! Whether you’re tuning databases, shaving CPU cycles and packets, optimizing AI inference, building web services, or simply curious about making things go faster... A friendly meetup all about performance, hosted by @jedisct1. Dec 4, 6pm https://t.co/JaokblDarv
So cool!! Polar Signals reduced query runtimes by 70% switching from Parquet to Vortex 🤯🚀
We completed a major project to switch our storage file format from Parquet to Vortex 🌪️ resulting in 70% average query performance improvement across the board 🚀 Learn more about how rethinking interface-imposed limitations unlocked these gains in our latest blog post 👇
The talk on @SpiralDB at @CMUDB (https://t.co/6mRfsnDZiP) is a great one. I think it would also be interesting to hear a counterpoint about @ApacheParquet that explains the actual technical details of that format, the cathedral vs. bazaar management, options with metadata, etc.
Today's Future Data Systems Seminar Speaker: Will Manning (@_willmanning) will present @SpiralDB's Vortex file format (@vortexdotdev). Vortex is now a @LFAIDataFdn project. Zoom talk open to public at 4:30pm ET. YouTube video available after:
db.cs.cmu.edu
Apache Parquet revolutionized columnar storage after its initial release in 2013, but...
Their moat is building really awesome stuff.
A day in early 2023:
@Robert3005: We don't do this a lot, so this is, like, a really huge deal.
@ngates_: We wanna invite you to have lunch with us every day.
me: Oh, it's ok, I've kind of just left London.
@_willmanning: Coolness. So we'll see you in a week.
So pumped to back @_willmanning @SpiralDB building data infrastructure for machine scale!
"What sets them [Spiral’s founders] apart: deep technical rigor paired with respect for academic research; relentless engineering execution balanced with creative problem solving, and a true obsession with customer needs." 💯 spiraldb dot com /careers
AI teams need a better data platform. Excited to support @SpiralDB's launch today and lead their seed: https://t.co/vvgniOhj9G
Congrats @_willmanning and team! We already took Vortex for a spin and had a great experience! There might be more on this announced soon - sth sth dataloader! 👀
This is the craziest “why I invested” story you’ve ever read. So read it for the drama and to discover the database you need to saturate your GPUs, handle multimodal data without headaches, and build the genAI app you should be building. https://t.co/Xx2DtJDLWs
amplifypartners.com
Spiral is building the multimodal data platform AI teams really need.
8/ The gap between AI leaders and laggards is widening. The enterprises that get their data AI-ready today will have an insurmountable advantage tomorrow. https://t.co/CuOitUDggh
spiraldb.com
Data 3.0, backed by the best
7/ Backed by $22M from Amplify Partners & General Catalyst. Working with teams in computer vision, document intelligence, and multimodal AI. If you're spending >10% of your time on data infrastructure instead of model development, we should talk.
6/ Real talk: OpenAI and Anthropic aren't using traditional data warehouses. They built custom infrastructure because they had to. We're making that same capability accessible to everyone.
5/ We call it the "Third Age" of data:
• First Age: Human inputs → Human outputs (Postgres era)
• Second Age: Machine inputs → Human outputs (Big Data era)
• Third Age: Machine inputs → Machine outputs (AI era)
Your infrastructure needs to evolve.
4/ Spiral is our database built on Vortex:
• Direct S3 → GPU data loading (skip the CPU bottleneck)
• One API for 10KB embeddings AND 4GB videos
• Time-bounded, audited permissions (no more credentials passed to AI agents)
• Actually saturates your GPUs
3/ We built Vortex—a new columnar format that's 10-20x faster than Parquet for scans, 100-200x faster for random access. Microsoft, Snowflake, and Palantir are already backing it. We donated it to the Linux Foundation because infrastructure should be open.
2/ The problem: Your $40k/month H100 sits idle 70% of the time. Not because you lack data, but because your infrastructure can't feed it fast enough. Reading 4M images from S3? That's 55 HOURS of network overhead. For one second of GPU compute.
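A quick back-of-envelope check on the 55-hour figure. This is only a sketch: the thread does not say how the number was derived, so the model here (one S3 request per image, issued strictly one at a time) and the function names are assumptions. Under that model, 55 hours over 4M images works out to roughly 50 ms per request.

```python
# Back-of-envelope sketch. ASSUMPTION: one request per image, issued serially.
# Only the two constants below come from the thread.
IMAGES = 4_000_000       # images read from S3
OVERHEAD_HOURS = 55      # total network overhead quoted in the thread


def per_image_latency_ms(images: int, hours: float) -> float:
    """Per-request latency (ms) implied by a total overhead, if requests are serial."""
    return hours * 3600 * 1000 / images


def total_hours(images: int, latency_ms: float) -> float:
    """Total hours for `images` serial requests at a fixed per-request latency."""
    return images * latency_ms / 1000 / 3600


implied_ms = per_image_latency_ms(IMAGES, OVERHEAD_HOURS)
print(f"implied latency: {implied_ms:.1f} ms per image")
print(f"round trip: {total_hours(IMAGES, implied_ms):.1f} hours")
```

The round trip recovers the quoted total, so the figure is at least consistent with per-request latency dominating when objects are fetched one by one.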