Jay Chia - daft.ai @JayChia5 X Profile

Followers

358

Following

151

Media

56

Statuses

424

Cofounder @ Eventual. Works on Daft (https://t.co/f2BxW6m2uo) the data engine for AI. LESS OOM MORE ZOOM

San Francisco, CA

Joined August 2022

Jay Chia - daft.ai

@JayChia5

1 year

Late night rant: Spark is an awesome piece of software. But a horrible developer experience. What happened to OSS that was simply `apt install` and 🚀? Why should software be excused for slow local performance because it was built for "production scale"? So much of "big data".

0

1

18

Jay Chia - daft.ai

@JayChia5

11 days

Ohhhhh yeah data is so back baby.

swyx

@swyx

12 days

i think this is comparatively slept on. the Current Thing if you depart from OAI (eg TML?) you are doing "we are extremely cracked and we will do custom RL for you!" as a service. 3 dudes in a basement immediately worth $500m. so it's notable when a former OAI RL person says

0

2

Jay Chia - daft.ai

@JayChia5

17 days

“What’s your Roman Empire?” “Daft. URL downloads, dynamic streaming execution…” 😭😭😭

Daft

@daftengine

18 days

A glimpse into life at Eventual, Series A announcement edition! Thank you all again for the love and support we received last week. Fun fact: Our meeting rooms are named after Daft Punk songs. Can you guess who is under the Daft Punk helmet?

0

4

Jay Chia - daft.ai

@JayChia5

19 days

Actually crazy. Went through our emails and yes we interviewed him back in 2022 for our first hire at Eventual. Did not make it past our bar, but I remember him being decent. Seems there's a pattern also of him approaching open-source companies.

Suhail

@Suhail

19 days

PSA: there’s a guy named Soham Parekh (in India) who works at 3-4 startups at the same time. He’s been preying on YC companies and more. Beware. I fired this guy in his first week and told him to stop lying / scamming people. He hasn’t stopped a year later. No more excuses.

1

4

15

Jay Chia - daft.ai

@JayChia5

24 days

Like if your natural pace is 100 hours a week… all the more power to you lol.

0

Jay Chia - daft.ai

@JayChia5

24 days

Interestingly this doesn’t necessarily mean working LESS hours. I actually feel like you could work more hours overall doing this because the work is so much less draining if it isn’t just “busy-work”. See: late night manic coding because of an obsession with quality.

1

0

Jay Chia - daft.ai

@JayChia5

24 days

Slow Productivity in the world of tech startups… thoughts? So far at Eventual the biggest inflection points feel like they’ve come from deliberate thinking and obsessive engineering rather than the frenetic 50-calls-a-week GTM that is perhaps the more common SaaS startup advice.

1

0

6

Jay Chia - daft.ai

@JayChia5

25 days

Hold on, did I miss something? Querying a long-form SEC 10-K document is still some crazy chained data pipeline in 2025. Something something RAG something embedding something something chunking + LLM calls 🤮🤮 No wonder there’s like 200 document parsing startups.

1

0

4

Jay Chia - daft.ai

@JayChia5

27 days

Incredibly excited to work with the team at @felicis especially @AstasiaMyers who has been an absolute POWERHOUSE for us.

Felicis

@felicis

27 days

Multimodal is the new default for AI, and legacy infrastructure can’t handle it. Eventual built @DaftEngine to redefine how we process video, audio, and images at scale. We’re proud to lead their Series A and work with @sammy_sidhu and @JayChia5. 👉

0

2

12

Jay Chia - daft.ai

@JayChia5

27 days

If you’re an ML/AI researcher, let’s work on your data for pre/post-training. If you’re an AI application engineer, let’s talk about your multimodal applications. If you’re an engineer who is frustrated with Spark, let’s get you on a data engine that’s built for 2025.

0

1

Jay Chia - daft.ai

@JayChia5

27 days

🐍We're going to accelerate the entire industry towards software that works natively with both multimodal data and LLMs. We're building industry-best solutions for data/AI systems that work on a single machine, on a cluster, on GPUs and also with remote LLMs.

1

0

1

Jay Chia - daft.ai

@JayChia5

27 days

🚀We're growing a team of incredible engineers with both breadth and depth across domains such as: databases/data systems (ex-Databricks, CMU), distributed systems (ex-AWS, Render), ML/AI (ex-Tesla, Nvidia, GitHub Copilot) and Product (Stripe, Uber).

1

0

Jay Chia - daft.ai

@JayChia5

27 days

🔥 We're now a Series A company with the funding, investors and partners to make our vision a reality.

1

0

Jay Chia - daft.ai

@JayChia5

27 days

I couldn't be more proud of our team for this AMAZING milestone today! To everyone who's followed this journey from the very beginning: a heartfelt thank you, and a promise - there's a revolution coming for multimodal data and AI. We're leading the charge :)

Sammy Sidhu

@Sammy_Sidhu

27 days

Today we're announcing that Eventual has raised $30M in Seed and Series A funding from @CRV and @felicis as well as @ycombinator, @M12vc and @Citi and others. The AI era needs data infrastructure built for AI, not retrofitted. 🧵

2

0

17

Jay Chia - daft.ai

@JayChia5

2 months

In fact, @desmondcheongzx vibe-coded a custom data sink that would stitch images in the dataframe together into a video and save that as the output of the pipeline. Wacky, but tbh sky’s the limit here :)

0

1

Jay Chia - daft.ai

@JayChia5

2 months

Multimodal/unstructured data often means user-defined data. That’s why you’re going to need User-Defined Data Sources and Sinks. This is how you get the best-in-class performance from the daft engine + integration with whatever crazy format you can cook up.

Daft

@daftengine

2 months

Introducing User-Defined Data Sources & Sinks. Now you can write to any format – proprietary, vectorDB, whatever – with full distributed power in Daft. We even wrote a @trychroma sink in ~100 lines, LIVE demo + PR open 🔥

2

0

5

Jay Chia - daft.ai

@JayChia5

2 months

RT @criccomini: .@daftengine is drop-in API compatible with PySpark now! 😈

0

3

0

Jay Chia - daft.ai

@JayChia5

2 months

Daft is now PySpark API-compatible :) Switching your Spark code to Daft is literally 2 lines of code.

```
from daft.pyspark import SparkSession
spark = SparkSession.builder.local().getOrCreate()
```

#AntiSparkSocialClub

Daft

@daftengine

2 months

✅No JVMs ✅No JARs ✅Local or Distributed ✅One engine for all. Daft #LaunchWeek Day 3: SPARK CONNECT FOR DAFT 🚀. Switch from PySpark to Daft with just TWO lines of code and run the SAME Spark queries, but faster and simpler. And easily scale from local to distributed with Ray.

2

6

26

Jay Chia - daft.ai

@JayChia5

2 months

Still lots of work that needs to be done here, but I cannot be more excited for this to now be the default execution model in Daft.

0

Jay Chia - daft.ai

@JayChia5

2 months

Any data engine working with multimodal data needs to be streaming-based and do this intelligent batching. That way we enjoy the benefits of both parallelism/vectorization and memory stability. Otherwise… like in Spark, you’re going to have massive OOM issues…

1

3
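
To make the batching idea above concrete, here is a minimal sketch of size-aware streaming batching in plain Python. It is not Daft's implementation; the function name `dynamic_batches`, the `size_of` callback, and the `max_batch_bytes` budget are all hypothetical names chosen for illustration. The point it shows: batches are cut by an in-memory byte budget rather than a fixed row count, so small rows are grouped for vectorization while large rows (say, freshly downloaded images) end up in small batches.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def dynamic_batches(
    items: Iterable[T],
    size_of: Callable[[T], int],
    max_batch_bytes: int,
) -> Iterator[List[T]]:
    """Lazily group a stream into batches that stay under a byte budget.

    Hypothetical helper for illustration only, not part of any library.
    A single item larger than the budget is emitted as its own batch.
    """
    batch: List[T] = []
    batch_bytes = 0
    for item in items:
        item_bytes = size_of(item)
        # Flush the current batch if adding this item would exceed the budget.
        if batch and batch_bytes + item_bytes > max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(item)
        batch_bytes += item_bytes
    # Emit whatever is left once the stream is exhausted.
    if batch:
        yield batch


if __name__ == "__main__":
    # Mixed payload sizes: a few tiny rows, then some large ones.
    payloads = [b"x" * n for n in (10, 10, 10, 500, 10, 900, 900)]
    sizes = [len(b) for b in dynamic_batches(payloads, len, max_batch_bytes=1000)]
    print(sizes)  # [5, 1, 1]
```

Because the function is a generator, only one batch is held in memory at a time, which is the memory-stability half of the argument; each yielded batch can still be processed in a vectorized or parallel way downstream.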

Jay Chia - daft.ai

@JayChia5

2 months

Cannot be overstated how much of a paradigm shift this was for @daftengine. In analytics, your data usually gets SMALLER (aggregations, groupby etc). In multimodal/AI… it tends to EXPAND with HUGE heap memory usage. Think: downloading data from URLs, running models etc.

Daft

@daftengine

2 months

🔥 Fixed batch sizes are old news. DAY 2 OF DAFT #LAUNCHWEEK: Introducing Dynamic Execution for Multimodal Data Processing. Daft is built to adapt in real time to multimodal workloads. Resize images, upload to S3, write to parquet – All optimized. All in one pipeline. ✨ Ditch

1

0

3