Shreya Shankar @sh_reya X Profile

Shreya Shankar

@sh_reya

Followers

48K

Following

14K

Media

388

Statuses

5K

doing a PhD @Berkeley_EECS, building https://t.co/PmuOqAYt6q | teaching https://t.co/CTWJ6z0JEg | formerly ML eng & undergrad @Stanford CS

Berkeley, CA

Joined January 2014

Shreya Shankar

@sh_reya

11 months

LLMs have made exciting progress on hard tasks! But they still struggle to analyze complex, unstructured documents (including today's Gemini 1.5 Pro 002). We (UC Berkeley) built 📜DocETL, an open-source, low-code system for LLM-powered data processing:

35

257

2K

Shreya Shankar

@sh_reya

12 hours

RT @ihower: I’m currently taking AI Evals for Engineers & PMs (Cohort 2) by @HamelHusain and @sh_reya. The course focuses on building an a….

0

1

0

Shreya Shankar

@sh_reya

15 hours

RT @clstrfq: Two projects are already being formed to apply the lessons from AI Evaluations for Engineers and PMs. This is the most compreh….

0

1

0

Shreya Shankar

@sh_reya

15 hours

RT @amrit_za: One of my favourite aspects of this course is that it suits many learning styles. If you're like me and prefer reading, the….

0

1

0

Shreya Shankar

@sh_reya

22 hours

RT @sh_reya: Our first DocETL paper has been accepted to VLDB 2025! DocETL is a system we’ve been building at Berkeley for reliable LLM-pow….

0

31

0

Shreya Shankar

@sh_reya

22 hours

RT @najibninaba: 7/ 🔑 MOST VALUABLE LESSON:. Stop obsessing over perfect prompts. Start by systematically analyzing WHERE and WHY they fai….

0

1

0

Shreya Shankar

@sh_reya

22 hours

RT @_Trang: What makes evals different is that you are taking a data science approach of deeply understanding your user's conversion histor….

0

1

0

Shreya Shankar

@sh_reya

22 hours

Lesson 4 is a fun one. Hamel and I do a role-playing bit. We do error analysis while applying a poorly defined rubric to grade traces (to demonstrate poor Cohen’s kappa). I play the role of someone only spot-checking the traces. Hamel plays the role of a better dev who.

Dr. David Gérouville-Farrell

@unthank

23 hours

Slowly working through my notes on @HamelHusain and @sh_reya’s LLM Evals course today, covering lesson four on two topics:.

0

1

18
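
For readers unfamiliar with the metric mentioned in the lesson 4 post: Cohen's kappa measures how much two graders agree beyond what chance alone would produce. The sketch below is a minimal illustration, not material from the course. The function name `cohens_kappa` and the pass/fail labels for the two hypothetical graders are assumptions made up for this example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two graders labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    rate and p_e is the agreement expected by chance from each grader's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both graders must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the graders' marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both graders used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical grades on eight traces (illustrative data only).
grader_a = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
grader_b = ["pass", "fail", "fail", "pass", "pass", "pass", "fail", "pass"]

kappa = cohens_kappa(grader_a, grader_b)
print(f"kappa = {kappa:.3f}")
```

In this made-up data the graders agree on half the traces, which is slightly less than their label frequencies would produce by chance, so kappa comes out just below zero. A vague rubric tends to produce low values like this, which is the kind of result the role-play is meant to surface.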

Shreya Shankar

@sh_reya

1 day

RT @Allmarsh713: Been in build mode the last few months, working on some new projects, gearing up to ship to real users in weeks. AI agents….

0

2

0

Shreya Shankar

@sh_reya

1 day

RT @gojira: Loved the "AI Evals for Engineers & PMs" course by @HamelHusain and @sh_reya. It takes “look at your d….

maven.com

Learn proven approaches for quickly improving AI applications. Build AI that works better than the competition, regardless of the use-case.

0

3

0

Shreya Shankar

@sh_reya

2 days

RT @skylar_b_payne: "My biggest problem with evals? I have no idea where to start". Stop feeling overwhelmed with all the tools, models, et….

0

5

0

Shreya Shankar

@sh_reya

2 days

RT @jcardonnet: Last week I finished @HamelHusain and @sh_reya 's "AI Evals For Engineers & PMs" fantastic course. I expected up to date,….

0

1

0

Shreya Shankar

@sh_reya

2 days

RT @glorioustango: Just finished "AI Evals For Engineers & PMs" with @HamelHusain and @sh_reya 🔥 and it was a total game-changer. Highly re….

0

1

0

Shreya Shankar

@sh_reya

2 days

I just saw an AI-generated comment on an AI-generated summary of an AI-generated substack article on evals. The result? I want to escape this timeline.

26

8

128

Shreya Shankar

@sh_reya

2 days

RT @joshpitzalis: Just finished my second round of the AI Evals for Engineers & PMs course by @HamelHusain and @sh_reya — and it was worth….

0

1

0

Shreya Shankar

@sh_reya

3 days

No? It’s because they don’t invest in evals processes lol. AI can be extremely useful as a component in software without data retention or memory features.

Markets & Mayhem 🤖

@Mayhem4Markets

4 days

Wow. 95% of organizations investing in generative AI have little or nothing to show for it, per @MIT.

2

5

71

Shreya Shankar

@sh_reya

4 days

RT @vishal_learner: i have been looking for an explanation/analogy like this one for task-decomposition for awhile. this totally makes sens….

0

1

0

Shreya Shankar

@sh_reya

4 days

I’ll be presenting the paper on Tuesday, Sep 2 at VLDB. You can read it at Follow along with the open source project at and This first paper was a super fun collaboration with Tristan, Tarak, Eugene, and my.

github.com

A system for agentic LLM-powered data processing and ETL - ucbepic/docetl

1

2

19

Shreya Shankar

@sh_reya

4 days

For me, DocETL has been more than a lesson in how to build reliable AI-powered data systems. Bigger models aren’t enough; reliability comes from structuring tasks into pipelines, and from optimizers that can invent, rewrite, and validate those pipelines for us. But the DocETL.

1

9

Shreya Shankar

@sh_reya

4 days

On the Berkeley police misconduct dataset, DocETL’s optimizer automatically discovered pipelines that doubled recall compared to a strong human-engineered baseline. Pipelines that would have taken experts weeks of tinkering emerged automatically from the search. We are pretty.

1

0

8

Shreya Shankar

@sh_reya

4 days

So over the year we built DocETL: a system that lets users author pipelines of LLM-powered operators, with an optimizer that rewrites those pipelines top-down. Given a library of 13 open-ended rewrite directives — guidelines for how to logically decompose a pipeline — the.

1

0

7