Jay Chia - daft.ai Profile
Jay Chia - daft.ai

@JayChia5

Followers
358
Following
151
Media
56
Statuses
424

Cofounder @ Eventual. Works on Daft (https://t.co/f2BxW6m2uo) the data engine for AI. LESS OOM MORE ZOOM

San Francisco, CA
Joined August 2022
@JayChia5
Jay Chia - daft.ai
1 year
Late night rant: Spark is an awesome piece of software. But a horrible developer experience. What happened to OSS that was simply `apt install` and 🚀? Why should software be excused for slow local performance because it was built for "production scale"? So much of "big data".
0
1
18
@JayChia5
Jay Chia - daft.ai
11 days
Ohhhhh yeah data is so back baby.
@swyx
swyx
12 days
i think this is comparatively slept on. the Current Thing if you depart from OAI (eg TML?) you are doing "we are extremely cracked and we will do custom RL for you!" as a service. 3 dudes in a basement immediately worth $500m. so it's notable when a former OAI RL person says
0
0
2
@JayChia5
Jay Chia - daft.ai
17 days
“What’s your Roman Empire?” “Daft. URL downloads, dynamic streaming execution...” 😭😭😭
@daftengine
Daft
18 days
A glimpse into life at Eventual, Series A announcement edition! Thank you all again for the love and support we received last week. Fun fact: Our meeting rooms are named after Daft Punk songs. Can you guess who is under the Daft Punk helmet?
0
0
4
@JayChia5
Jay Chia - daft.ai
19 days
Actually crazy. Went through our emails and yes we interviewed him back in 2022 for our first hire at Eventual. Did not make it past our bar, but I remember him being decent. Seems there's a pattern also of him approaching open-source companies.
@Suhail
Suhail
19 days
PSA: there’s a guy named Soham Parekh (in India) who works at 3-4 startups at the same time. He’s been preying on YC companies and more. Beware. I fired this guy in his first week and told him to stop lying / scamming people. He hasn’t stopped a year later. No more excuses.
1
4
15
@JayChia5
Jay Chia - daft.ai
24 days
Like if your natural pace is 100 hours a week... all the more power to you lol.
0
0
0
@JayChia5
Jay Chia - daft.ai
24 days
Interestingly this doesn’t necessarily mean working LESS hours. I actually feel like you could work more hours overall doing this because the work is so much less draining if it isn’t just “busy-work”. See: late night manic coding because of an obsession with quality.
1
0
0
@JayChia5
Jay Chia - daft.ai
24 days
Slow Productivity in the world of tech startups... thoughts? So far at Eventual the biggest inflection points feel like they’ve come from deliberate thinking and obsessive engineering rather than frenetic 50-calls-a-week-GTM that is perhaps the more common SaaS startup advice
1
0
6
@JayChia5
Jay Chia - daft.ai
25 days
Hold on did I miss something? Querying a long-form SEC 10-K document is still some crazy chained data pipeline in 2025. Something something RAG something embedding something something chunking + LLM calls 🤮🤮. No wonder there’s like 200 document parsing startups.
1
0
4
@JayChia5
Jay Chia - daft.ai
27 days
Incredibly excited to work with the team at @felicis especially @AstasiaMyers who has been an absolute POWERHOUSE for us.
@felicis
Felicis
27 days
Multimodal is the new default for AI, and legacy infrastructure can’t handle it. Eventual built @DaftEngine to redefine how we process video, audio, and images at scale. We’re proud to lead their Series A and work with @sammy_sidhu and @JayChia5. 👉
0
2
12
@JayChia5
Jay Chia - daft.ai
27 days
If you’re an ML/AI researcher, let’s work on your data for pre/post-training. If you’re an AI application engineer, let’s talk about your multimodal applications. If you’re an engineer who is frustrated with Spark, let’s get you on a data engine that’s built for 2025.
0
0
1
@JayChia5
Jay Chia - daft.ai
27 days
🐍We're going to accelerate the entire industry towards software that works natively with both multimodal data and LLMs. We're building industry-best solutions for data/AI systems that work on a single machine, on a cluster, on GPUs and also with remote LLMs.
1
0
1
@JayChia5
Jay Chia - daft.ai
27 days
🚀We're growing a team of incredible engineers with both the breadth and depth across domains such as: databases/data systems (ex-Databricks, CMU), distributed systems (ex-AWS, Render), ML/AI (ex-Tesla, Nvidia, GitHub Copilot) and Product (Stripe, Uber).
1
0
0
@JayChia5
Jay Chia - daft.ai
27 days
🔥 We're now a Series A company with the funding, investors and partners to make our vision a reality.
1
0
0
@JayChia5
Jay Chia - daft.ai
27 days
I couldn't be more proud of our team for this AMAZING milestone today! To everyone who's followed this journey from the very beginning: a heartfelt thank you, and a promise - there's a revolution coming for multimodal data and AI. We're leading the charge :)
@Sammy_Sidhu
Sammy Sidhu
27 days
Today we're announcing that Eventual has raised $30M in Seed and Series A funding from @CRV and @felicis as well as @ycombinator, @M12vc and @Citi and others. The AI era needs data infrastructure built for AI, not retrofitted. 🧵
2
0
17
@JayChia5
Jay Chia - daft.ai
2 months
In fact, @desmondcheongzx vibe-coded a custom data sink that would stitch images in the dataframe together into a video and save that as the output of the pipeline. Wacky, but tbh sky’s the limit here :).
0
0
1
@JayChia5
Jay Chia - daft.ai
2 months
Multimodal/unstructured data often means user-defined data. That’s why you’re going to need User-Defined Data Sources and Sinks. This is how you get the best-in-class performance from the daft engine + integration with whatever crazy format you can cook up.
@daftengine
Daft
2 months
Introducing User-Defined Data Sources & Sinks. Now you can write to any format – proprietary, vectorDB, whatever – with full distributed power in Daft. We even wrote a @trychroma sink in ~100 lines, LIVE demo + PR open 🔥
2
0
5
@JayChia5
Jay Chia - daft.ai
2 months
RT @criccomini: .@daftengine is drop-in API compatible with PySpark now! 😈
0
3
0
@JayChia5
Jay Chia - daft.ai
2 months
Daft is now PySpark API-compatible :). Switching your Spark code to Daft is literally 2 lines of code.
```
from daft.pyspark import SparkSession
spark = SparkSession.builder.local().getOrCreate()
```
#AntiSparkSocialClub
@daftengine
Daft
2 months
✅No JVMs ✅No JARs ✅Local or Distributed ✅One engine for all. Daft #LaunchWeek Day 3: SPARK CONNECT FOR DAFT 🚀 Switch from PySpark to Daft with just TWO lines of code and run the SAME Spark queries, but faster and simpler. And easily scale from local to distributed with Ray.
2
6
26
@JayChia5
Jay Chia - daft.ai
2 months
Still lots of work that needs to be done here, but I cannot be more excited for this to now be the default execution model in Daft.
0
0
0
@JayChia5
Jay Chia - daft.ai
2 months
Any data engine working with multimodal data needs to be streaming-based and do this intelligent batching. Thus we enjoy the benefits of both parallelism/vectorization as well as memory stability. Otherwise... like in Spark, you’re going to have massive OOM issues.
1
1
3
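To make the batching idea in the tweet above concrete, here is a minimal sketch of memory-bounded dynamic batching over a stream. It is not Daft's implementation: the function name `dynamic_batches`, the `size_of` callback, the `max_batch_bytes` budget and the sample payloads are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def dynamic_batches(
    items: Iterable[T],
    size_of: Callable[[T], int],
    max_batch_bytes: int,
) -> Iterator[List[T]]:
    """Lazily group items into batches that stay within a byte budget.

    An item larger than the whole budget is emitted alone, not dropped.
    """
    batch: List[T] = []
    batch_bytes = 0
    for item in items:
        item_bytes = size_of(item)
        # Flush first if adding this item would exceed the budget.
        if batch and batch_bytes + item_bytes > max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(item)
        batch_bytes += item_bytes
    if batch:
        yield batch


# Hypothetical payloads: small rows followed by large "downloaded" blobs.
payloads = [b"x" * 10, b"x" * 10, b"x" * 10, b"x" * 60, b"x" * 60, b"x" * 200]
batch_lengths = [len(b) for b in dynamic_batches(payloads, len, max_batch_bytes=100)]
print(batch_lengths)
```

Small items get grouped into larger batches, which keeps vectorization worthwhile, while large items end up in small batches, which keeps peak memory near the budget. Because the function is a generator, only one batch is held in memory at a time.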
@JayChia5
Jay Chia - daft.ai
2 months
Cannot be overstated how much of a paradigm shift this was for @daftengine. In analytics, your data usually gets SMALLER (aggregations, groupby etc). In multimodal/AI... it tends to EXPAND with HUGE heap memory usage. Think: downloading data from URLs, running models etc.
@daftengine
Daft
2 months
🔥 Fixed batch sizes are old news. DAY 2 OF DAFT #LAUNCHWEEK: Introducing Dynamic Execution for Multimodal Data Processing. Daft is built to adapt in real time to multimodal workloads. Resize images, upload to S3, write to parquet – All optimized. All in one pipeline. ✨ Ditch
1
0
3