David JH Wu, MD @davidjhwu X Profile

David JH Wu, MD

@davidjhwu

Followers

3K

Following

8K

Media

570

Statuses

3K

@stanfordradonc resident, previously @scvmed intern. research in healthcare + ai intersection

https://t.co/j9pwtmXmxW

Cupertino, CA

Joined April 2012


David JH Wu, MD

@davidjhwu

1 month

Full text here: https://t.co/rgc0BQ1JRn My thoughts on this are still evolving. In oncology, our 100m dash is the Kaplan-Meier overall survival curve (IMHO). That's the best way to get docs to align. I recently gave a talk on this and will share it with the greater internet soon. (2/2)

0

1

David JH Wu, MD

@davidjhwu

1 month

Wanted to share some more personal reflections on what it's like to build, evaluate, and research some of these frontier models with real-world, de-identified patient cases. I think it's in these messy, non-polished cases where you can really see intelligence shine (or not) (1/2)

1

0

2

Sai Balasubramanian, M.D., J.D.

@saibalamdjd

1 month

It was great to learn about what @drethangoh, @AdamRodmanMD & @jonc101x are building at ARiSE. My latest for @Forbes ✍️🚀 @Stanford @StanfordDeptMed @Harvard @erichorvitz, @andrewparsonsMD, @andrewolsonmd, @vishnuravi @LiamGMcCoy @davidjhwu & many more! https://t.co/W4E8A2Qs6I

0

5

8

David JH Wu, MD

@davidjhwu

2 months

Great thread by one of the most talented triple-threat physician-coder-catlovers I have ever had the pleasure of working with!! Terrific work on some much-needed eval of clinical reasoning in LLMs

Liam McCoy, MD MSc

@LiamGMcCoy

2 months

Are language models overconfident in their clinical reasoning? Have reasoning optimizations made this problem worse? Our latest in NEJM AI - we adapt a human-validated, trivially scalable automated benchmark to get to the heart of clinical reasoning in AI systems

0

3

David JH Wu, MD

@davidjhwu

2 months

Cancer decision making is inherently multimodal but there is a lag in how we have adopted imaging into medical AI vs text

0

David JH Wu, MD

@davidjhwu

2 months

1st talk: Great talk by the legendary @mattlungrenMD about where we are today and some unique insight about Microsoft’s role in medical AI. Fun fact: 95% of EHR users use Microsoft. Could be an easy point of entry into healthcare, eg tumor boards held on Teams

1

0

1

David JH Wu, MD

@davidjhwu

2 months

Big day today! Welcome to the Symposium on AI in Medicine hosted by the @StanfordRadOnc department! Check out our awesome speaker lineup! Will tweet some highlights throughout the day 🌊

0

4

David JH Wu, MD

@davidjhwu

3 months

Sigh Got the book

0

3

David JH Wu, MD

@davidjhwu

3 months

@drethangoh @vishnuravi @jonc101x @LiamGMcCoy @fatemenateghi #AIinMedicine #digitalhealth #healthinformatics #MaML #healthai

0

1

David JH Wu, MD

@davidjhwu

3 months

Much thanks to my amazing collaborators @drethangoh @vishnuravi @jonc101x @LiamGMcCoy @fatemenateghi

1

0

5

David JH Wu, MD

@davidjhwu

3 months

This small pilot (n=40) opens the door for more rigorous, large-scale real-world evaluation of medical AI. Instead of just testing on clean vignettes, we can assess how AI performs on the actual messy complexity of clinical practice in a scalable way. https://t.co/EUjwdxuKdy

medrxiv.org

Specialist consults in primary care and inpatient settings typically address complex clinical questions beyond standard guidelines. eConsults have been developed as a way for specialist physicians to...

1

3

David JH Wu, MD

@davidjhwu

3 months

Both compared AI answers to actual specialist responses. Key finding: LLM-as-judge matched human physician performance when assessing clinical concordance. This suggests we can scalably benchmark AI for complex medical tasks in a way that is more considerate of physician reviewers.

1

5

David JH Wu, MD

@davidjhwu

3 months

The challenge? How do you evaluate AI performance on these real-world consults at scale? We can't have human physicians manually review every AI response, so we tested two automated evaluation methods: (1) LLM-as-judge and (2) decompose-then-verify (breaking down responses into atomic facts).

1

0

2
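To make the decompose-then-verify idea above concrete, here is a minimal sketch, not the method used in the study. It assumes a naive sentence split as the "atomic fact" decomposition and a token-overlap check as a stand-in for the verifier; in practice both steps would likely be LLM calls. The names `decompose`, `token_overlap_verifier`, and `concordance` are hypothetical.

```python
import re
from typing import Callable, List


def decompose(response: str) -> List[str]:
    """Assumption: approximate 'atomic facts' by splitting on sentence ends."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p.strip() for p in parts if p.strip()]


def token_overlap_verifier(fact: str, reference: str, threshold: float = 0.5) -> bool:
    """Hypothetical stand-in for an LLM verifier: a fact counts as supported
    when at least `threshold` of its tokens also appear in the reference."""
    fact_tokens = set(re.findall(r"[a-z0-9]+", fact.lower()))
    ref_tokens = set(re.findall(r"[a-z0-9]+", reference.lower()))
    if not fact_tokens:
        return False
    return len(fact_tokens & ref_tokens) / len(fact_tokens) >= threshold


def concordance(
    response: str,
    reference: str,
    verifier: Callable[[str, str], bool] = token_overlap_verifier,
) -> float:
    """Fraction of decomposed facts in the AI response that the verifier
    judges to be supported by the specialist's reference answer."""
    facts = decompose(response)
    if not facts:
        return 0.0
    supported = sum(1 for fact in facts if verifier(fact, reference))
    return supported / len(facts)


if __name__ == "__main__":
    ai_answer = "Start metformin. Order a brain MRI."
    specialist = "Recommend starting metformin and rechecking A1c in three months."
    print(concordance(ai_answer, specialist))
```

The verifier is passed in as a function so the overlap heuristic can be swapped for a model-based judge without changing the scoring loop.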

David JH Wu, MD

@davidjhwu

3 months

Real-world cases are messy, often incomplete. At @Stanford we have collected a unique dataset: thousands of real physician-to-physician eConsult cases. Think actual specialists answering complex questions from primary care docs—not polished, textbook scenarios. (2/6)

1

0

4

David JH Wu, MD

@davidjhwu

3 months

Excited to share some of my recent work! Y'all have probably seen that there's been a lot of hype around LLMs acing the USMLE and medical vignettes. But here's the thing: standardized tests ≠ real-world medicine! How does AI do in the real world and how can we eval it? Read on

1

6

25

David JH Wu, MD

@davidjhwu

4 months

Buying API credits off of Anthropic really has changed my perspective on what $5 is. I’ll never be able to look at a cup of coffee the same anymore. I’ll take the $5 API credit over a latte ANY DAY

2

1

7

David JH Wu, MD

@davidjhwu

5 months

Wrote up some of my thoughts and reflections from this past year’s #AIMI25. Enjoy! https://t.co/eOEyuqQ9re

0

2

David JH Wu, MD

@davidjhwu

5 months

Great panel on publishing in Health AI at #AIMI25 featuring @charlottehaug (NEJM AI), @yhswen (JAMA+AI), and Chris Paton (BMJ), hosted by @drnigam. “What do editors want?” - Prefer something prospective, reproducible, and useful today. “Do science the standard way”

0

1

5

David JH Wu, MD

@davidjhwu

5 months

Some interesting quotes: “These past few months we’ve seen the same amount of progress in model developments as the past few years” “Fine-tuning on medical tasks usually causes loss of other skills. Training pipeline is delicate. Risk of catastrophic forgetting.”

1

0

1

David JH Wu, MD

@davidjhwu

5 months

Packed house at #AIMI25 for this star-studded panel featuring @Emily_Alsentzer (Stanford CS), @thekaransinghal (OpenAI), @KhaledSaab11 (DeepMind), @marinkazitnik (Harvard Biomedical Informatics), and @drethangoh 🎯: foundation model roadmap for health AI

1

7

55