Emily Ekdahl @emekdahl X Profile

Emily Ekdahl

@emekdahl

Followers

227

Following

1K

Media

40

Statuses

554

AI/LLM Ops Engineer

https://t.co/oXiWKZ5mo4

Chicago, IL

Joined October 2012


Emily Ekdahl

@emekdahl

9 days

Don’t sleep on @WisprFlow! I’m so grateful to @IsaacFlath and @HamelHusain for recommending this productivity hack! Reduces friction while promoting and brainstorming!

0

Shreya Shankar

@sh_reya

16 days

✍️new blog post: on the consumption of AI-generated content at scale

8

28

161

Emily Ekdahl

@emekdahl

15 days

Why Your AI Music Prompts Aren’t Working (And What To Do Instead) What I learned trying to make an album inspired by the @aiDotEngineer code conference @sunomusic https://t.co/wCVAY8OAfC

emekdahl.medium.com

Photo by Siednji Leon on Unsplash

0

Eugene Yan

@eugeneyan

22 days

After repeating myself for the nth time on how to build product evals, I figured I should write it down. It's just three basic steps: (i) labeling a small dataset, (ii) aligning LLM evaluators, and (iii) running the eval harness with each config change. https://t.co/HjUL3yZQPk

eugeneyan.com

Label some data, align LLM-evaluators, and run the eval harness with each change.

6

28

199
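The three steps in the post above (label a small dataset, align an LLM evaluator, rerun the harness on each config change) can be sketched roughly as follows. This is a minimal illustration, not the method from the linked article: `Example`, `agreement`, `run_harness`, and `keyword_evaluator` are hypothetical names, and the keyword check is only a stand-in for a real LLM evaluator call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    text: str          # model output under evaluation
    human_label: bool  # step (i): True means a human marked the output as defective


def agreement(human: List[bool], judge: List[bool]) -> Dict[str, float]:
    """Step (ii): compare evaluator verdicts against the human labels."""
    if not human or len(human) != len(judge):
        raise ValueError("need two non-empty label lists of equal length")
    tp = sum(h and j for h, j in zip(human, judge))
    fp = sum((not h) and j for h, j in zip(human, judge))
    fn = sum(h and (not j) for h, j in zip(human, judge))
    correct = sum(h == j for h, j in zip(human, judge))
    return {
        "accuracy": correct / len(human),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def run_harness(examples: List[Example],
                evaluator: Callable[[str], bool]) -> Dict[str, float]:
    """Step (iii): rerun this whenever the prompt, model, or config changes."""
    judge = [evaluator(ex.text) for ex in examples]
    human = [ex.human_label for ex in examples]
    return agreement(human, judge)


def keyword_evaluator(text: str) -> bool:
    # Hypothetical stand-in for an LLM evaluator: flags apologetic refusals.
    return "sorry" in text.lower()


EXAMPLES = [
    Example("Sorry, I cannot help with that.", True),
    Example("The total is 42.", False),
    Example("I am not sure what you mean.", True),
    Example("Sorry for the delay, here is the answer.", False),
]

if __name__ == "__main__":
    print(run_harness(EXAMPLES, keyword_evaluator))
```

Low precision or recall against the human labels would suggest revising the evaluator before trusting its verdicts on unlabeled outputs.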

pedram.md

@pdrmnvd

22 days

Do you love Claude's plan-mode question asker and wish you could bring it with you everywhere? Add `AskUserQuestion` to allowed-tools in a .claude/command then explicitly tell Claude to use it. > Use the AskUserQuestion tool to ask the user... Here's me using it for a PR

13

23

274

Taylor, CPAI

@taylorcpai

28 days

Six months ago I was but a test prompt. Today, I can file your taxes. https://t.co/0GPC8nJnre.

1

5

6

Sai Dhanak

@SaiDhanak

28 days

AI can code, why can't it do your taxes? Introducing: https://t.co/9Kzksur70u.

43

45

152

Sai Dhanak

@SaiDhanak

1 month

A good friend and colleague told me at the start of building in AI, that a true agent is ⚡ 'lightning in a bottle'. And right now we have lightning. ↓ True human and agent collaboration. We can't wait to introduce a new way of consumer accounting very soon.

1

3

4

Emily Ekdahl

@emekdahl

1 month

Scenarios by @LangWatchAI is saving my life while evaluating #AI multi-turn conversations 🙌

0

Emily Ekdahl

@emekdahl

2 months

SpecFlow changed how I build with AI agents. Huge thanks to the @specstoryai team, @isaac_flath, and @intellectronica for introducing me to this game-changing workflow. 🚀 https://t.co/YwsFr0PXEm

specflow.com

Use the open Specflow method to turn intent into software through structured planning and iterative execution with software agents.

1

4

7

Ryan

@_PaperMoose_

2 months

When you deploy an LLM-as-a-Judge, you’re shipping a classifier into production. Each new version is a hypothesis about how the model interprets the world. It’s data science, just expressed in natural language. Here’s what that looked like for a recent client project where we

8

14

130

vLLM

@vllm_project

2 months

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping

52

377

3K

Angela Duckworth

@angeladuckw

6 months

In an AI world, it’s easy to avoid effort. That’s why students need teachers more—to push them toward the hard things now that shape who they become later. #Education #AI #TeachingMatters #FutureOfLearning

17

90

394

Emily Ekdahl

@emekdahl

4 months

Can #GPT5 actually do taxes? We ran it on @ColumnTax’s TaxCalcBench.
Full return: 30.4% strict ✅ | 53.4% lenient 🤔
Line items: 80.6% strict | 85.4% lenient 📊
Line accuracy is strong. Whole-return accuracy? Not IRS-ready yet. https://t.co/fXt0jIkMsi #TaxCalcBench #AI #tax

github.com

GPT-5 support with results! four runs, pass at k of 1 added debugging support for litellm added gpt-5 to model config **SUMMARY TABLE** Model Name Thinking Test...

0

3

Hamel Husain

@HamelHusain

6 months

The most useful bit of my system prompt is this If I provide any feedback on how to improve something, suggest improvements to my prompt that I can make to avoid similar mistakes in the future. Put any prompt improvement suggestions in separate <prompt-improvement> tags.

7

12

271

Emily Ekdahl

@emekdahl

6 months

Can't say enough good things about the AI evals course run by @sh_reya and @HamelHusain! It is informed by real production work across dozens of clients. The opportunities and challenges resonate with my experience evaluating & deploying production AI products.

0

1

7

Leonie

@helloiamleonie

1 year

2023 vs. 2024
2023: Vector search is all you need
2024: Evaluate vector/hybrid search against BM25 baseline
2023: „Look, this prompt works!“
2024: Prompt optimization with DSPy
2023: …
2024: Evals with AI-as-a-judge
We’ve come a long way, but we’re still so early.

0

17

112

François Chollet

@fchollet

1 year

Getting employees to work hard and deliver really isn't a matter of mandating work-from-office and long hours. It's a matter of incentives and ownership. People do their best when they work on interesting problems, in a self-directed manner, and get rewarded for success. This

73

179

2K

Adam Grant

@AdamMGrant

1 year

Insecure leaders ridicule others. Secure leaders laugh at themselves. The ability to make fun of yourself opens the door to candor. It’s a mark of humility and a catalyst for learning. Great leaders take their work seriously, but they don't take themselves too seriously.

29

617

2K