Shreya Shankar Profile
Shreya Shankar

@sh_reya

Followers
48K
Following
14K
Media
388
Statuses
5K

doing a PhD @Berkeley_EECS, building https://t.co/PmuOqAYt6q | teaching https://t.co/CTWJ6z0JEg | formerly ML eng & undergrad @Stanford CS

Berkeley, CA
Joined January 2014
@sh_reya
Shreya Shankar
11 months
LLMs have made exciting progress on hard tasks! But they still struggle to analyze complex, unstructured documents (including today's Gemini 1.5 Pro 002). We (UC Berkeley) built 📜DocETL, an open-source, low-code system for LLM-powered data processing:
35
257
2K
@sh_reya
Shreya Shankar
12 hours
RT @ihower: I’m currently taking AI Evals for Engineers & PMs (Cohort 2) by @HamelHusain and @sh_reya. The course focuses on building an a….
0
1
0
@sh_reya
Shreya Shankar
15 hours
RT @clstrfq: Two projects are already being formed to apply the lessons from AI Evaluations for Engineers and PMs. This is the most compreh….
0
1
0
@sh_reya
Shreya Shankar
15 hours
RT @amrit_za: One of my favourite aspects of this course is that it suits many learning styles. If you're like me and prefer reading, the….
0
1
0
@sh_reya
Shreya Shankar
22 hours
RT @sh_reya: Our first DocETL paper has been accepted to VLDB 2025! DocETL is a system we’ve been building at Berkeley for reliable LLM-pow….
0
31
0
@sh_reya
Shreya Shankar
22 hours
RT @najibninaba: 7/ 🔑 MOST VALUABLE LESSON: Stop obsessing over perfect prompts. Start by systematically analyzing WHERE and WHY they fai….
0
1
0
@sh_reya
Shreya Shankar
22 hours
RT @_Trang: What makes evals different is that you are taking a data science approach of deeply understanding your user's conversion histor….
0
1
0
@sh_reya
Shreya Shankar
22 hours
Lesson 4 is a fun one. Hamel and I do a role-playing bit. We do error analysis while applying a poorly defined rubric to grade traces (to demonstrate poor Cohen’s kappa). I play the role of someone only spot-checking the traces. Hamel plays the role of a better dev who.
@unthank
Dr. David Gérouville-Farrell
23 hours
Slowly working through my notes on @HamelHusain and @sh_reya’s LLM Evals course today, covering lesson four on two topics:
0
1
18
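For readers unfamiliar with the metric mentioned in the lesson 4 post above: Cohen's kappa measures how much two graders agree beyond what chance alone would produce. The sketch below is a minimal illustration, not material from the course. The function name `cohens_kappa` and the pass/fail labels are made up for the example, and it assumes each grader labeled the same traces in the same order.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability both raters pick the same label
    # if each labeled independently at their own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label everywhere; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical grades from two reviewers applying a vague rubric to eight traces.
grader_1 = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
grader_2 = ["pass", "fail", "fail", "pass", "pass", "pass", "fail", "pass"]

kappa = cohens_kappa(grader_1, grader_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up data the graders agree on half the traces, which is roughly what their base rates would produce by chance, so kappa lands near zero. That is the kind of result a poorly defined rubric tends to give.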
@sh_reya
Shreya Shankar
1 day
RT @Allmarsh713: Been in build mode the last few months, working on some new projects, gearing up to ship to real users in weeks. AI agents….
0
2
0
@sh_reya
Shreya Shankar
1 day
RT @gojira: Loved the "AI Evals for Engineers & PMs" course by @HamelHusain and @sh_reya. It takes “look at your d….
maven.com
Learn proven approaches for quickly improving AI applications. Build AI that works better than the competition, regardless of the use-case.
0
3
0
@sh_reya
Shreya Shankar
2 days
RT @skylar_b_payne: "My biggest problem with evals? I have no idea where to start". Stop feeling overwhelmed with all the tools, models, et….
0
5
0
@sh_reya
Shreya Shankar
2 days
RT @jcardonnet: Last week I finished @HamelHusain and @sh_reya 's "AI Evals For Engineers & PMs" fantastic course. I expected up to date,….
0
1
0
@sh_reya
Shreya Shankar
2 days
RT @glorioustango: Just finished "AI Evals For Engineers & PMs" with @HamelHusain and @sh_reya 🔥 and it was a total game-changer. Highly re….
0
1
0
@sh_reya
Shreya Shankar
2 days
I just saw an AI-generated comment on an AI-generated summary of an AI-generated substack article on evals. The result? I want to escape this timeline.
26
8
128
@sh_reya
Shreya Shankar
2 days
RT @joshpitzalis: Just finished my second round of the AI Evals for Engineers & PMs course by @HamelHusain and @sh_reya — and it was worth….
0
1
0
@sh_reya
Shreya Shankar
3 days
No? It’s because they don’t invest in evals processes lol. AI can be extremely useful as a component in software without data retention or memory features.
@Mayhem4Markets
Markets & Mayhem 🤖
4 days
Wow. 95% of organizations investing in generative AI have little or nothing to show for it, per @MIT.
2
5
71
@sh_reya
Shreya Shankar
4 days
RT @vishal_learner: i have been looking for an explanation/analogy like this one for task-decomposition for awhile. this totally makes sens….
0
1
0
@sh_reya
Shreya Shankar
4 days
I’ll be presenting the paper on Tuesday, Sep 2 at VLDB. You can read it at Follow along with the open source project at and This first paper was a super fun collaboration with Tristan, Tarak, Eugene, and my.
github.com
A system for agentic LLM-powered data processing and ETL - ucbepic/docetl
1
2
19
@sh_reya
Shreya Shankar
4 days
For me, DocETL has been more than a lesson in how to build reliable AI-powered data systems. Bigger models aren’t enough; reliability comes from structuring tasks into pipelines, and from optimizers that can invent, rewrite, and validate those pipelines for us. But the DocETL.
1
1
9
@sh_reya
Shreya Shankar
4 days
On the Berkeley police misconduct dataset, DocETL’s optimizer automatically discovered pipelines that doubled recall compared to a strong human-engineered baseline. Pipelines that would have taken experts weeks of tinkering emerged automatically from the search. We are pretty.
1
0
8
@sh_reya
Shreya Shankar
4 days
So over the year we built DocETL: a system that lets users author pipelines of LLM-powered operators, with an optimizer that rewrites those pipelines top-down. Given a library of 13 open-ended rewrite directives — guidelines for how to logically decompose a pipeline — the.
1
0
7