Yegor Denisov-Blanch (@yegordb)
4K Followers · 1K Following · 60 Media · 494 Statuses
Stanford | Research: Software Engineering Productivity | 8th grade dropout | ex-Olympic Weightlifting National Champion (Master of Sport)
Stanford, CA · Joined February 2021
I’m at Stanford and I research software engineering productivity. We have data on the performance of >50k engineers from 100s of companies. Inspired by @deedydas, our research shows: ~9.5% of software engineers do virtually nothing: Ghost Engineers (0.1x-ers)
751 replies · 1K reposts · 13K likes
Had a great time at NeurIPS talking about our position paper at the poster session -- and excited that the work also got an oral slot!
1 reply · 0 reposts · 3 likes
@YuningShen1 Interesting position paper: Machine learning conferences should establish a refutations and critiques track. By @yegordb. Have you ever struggled to reproduce prior work, only for the authors to never reply to your email? This paper proposes that conferences create a refutation…
1 reply · 1 repost · 3 likes
Excited to finally release the report @OpenRouterAI and I have been working on. The past year marked a decisive shift in how we build and use AI. Reasoning moved from the edges of research into the center of real-world production, driven by breakthroughs in model capability,…
2 replies · 3 reposts · 11 likes
some cool insights from @yegordb's talk on AI's impact on developer productivity at @aiDotEngineer 👇 1. The most productive teams are pulling away from the median. The rich get richer…
8 replies · 7 reposts · 53 likes
An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
40 replies · 150 reposts · 1K likes
Our position paper "Machine Learning Conferences Should Establish a 'Refutations and Critiques' Track" was accepted at #NeurIPS2025 and will be appearing as part of a panel discussion! 1/3
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
5 replies · 13 reposts · 170 likes
We tested how autonomous AI agents perform on real software tasks from our recent developer productivity RCT. We found a gap between algorithmic scoring and real-world usability that may help explain why AI benchmarks feel disconnected from reality.
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
17 replies · 77 reposts · 559 likes
This phenomenon highlights that there are individuals at every company working multiple jobs, just like Soham did. Note that the estimate above relies on assumptions that can sway the numbers by 30%+ in either direction. I will be writing a short paper with a…
0 replies · 0 reposts · 5 likes
We found that people working multiple jobs performed significantly worse than their peers: 0.62x the global median, on average.
1 reply · 0 reposts · 9 likes
Why might our initial findings underestimate the true rate? In our sample of 128.5K developers, we identified only 63 holding multiple jobs (0.05%). However, this significantly underestimates reality, as our dataset includes only 1.20% (957 out of 80,000+) of global…
1 reply · 0 reposts · 6 likes
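The arithmetic behind this extrapolation appears to be a simple coverage correction: a second job is only detectable when the other employer is also in the dataset. Here is a minimal sketch using the thread's own numbers; the detectability-equals-coverage assumption is my reading, not necessarily the study's exact method:

```python
# Back-of-the-envelope coverage correction using the thread's numbers.
# Assumption: a second job is only observable when the other employer is
# also among the sampled ~1.20% of companies, so the observed rate
# understates the true rate by roughly that coverage factor.

observed_overemployed = 63        # developers found holding 2+ jobs
sample_size = 128_500             # developers in the initial sample
company_coverage = 957 / 80_000   # ~1.20% of global companies in the dataset

observed_rate = observed_overemployed / sample_size       # ~0.05%
estimated_true_rate = observed_rate / company_coverage    # coverage-corrected

print(f"observed: {observed_rate:.2%}, extrapolated: {estimated_true_rate:.1%}")
# observed: 0.05%, extrapolated: 4.1%
```

This reproduces the 4.1% headline figure at the top of the thread.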
How did we determine if someone was overemployed? We matched individuals across companies using employer-provided data, including names, job titles, locations, GitHub IDs, emails, employment status (full-time vs. contractor), and additional encrypted identifiers.
1 reply · 0 reposts · 7 likes
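A hypothetical sketch of the matching step described above, linking employee records across companies on strong identifiers first and weaker ones second. The field names and fallback order are illustrative assumptions, not the study's actual schema:

```python
from collections import defaultdict

def match_keys(record):
    """Yield candidate match keys for one employer-provided record,
    strongest identifiers first (hypothetical field names)."""
    if record.get("github_id"):
        yield ("github", record["github_id"].lower())
    if record.get("email"):
        yield ("email", record["email"].lower())
    if record.get("name") and record.get("location"):
        yield ("name_loc", (record["name"].lower(), record["location"].lower()))

def cross_company_candidates(records):
    """Group records by match key; any key appearing at 2+ companies
    is a candidate for the same person holding multiple jobs."""
    companies_by_key = defaultdict(set)
    for r in records:
        for key in match_keys(r):
            companies_by_key[key].add(r["company"])
    return {k: cos for k, cos in companies_by_key.items() if len(cos) >= 2}
```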
How did we calculate this? We tracked engineers actively committing code to multiple companies' repositories across multiple months. We excluded contractors, part-timers, and people switching jobs. We then extrapolated these findings to estimate global numbers.
1 reply · 0 reposts · 7 likes
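An illustrative sketch of that detection rule; the column names, pandas framing, and the two-month threshold are my assumptions, not the study's published code:

```python
import pandas as pd

def flag_overemployed(commits: pd.DataFrame, min_months: int = 2) -> pd.Series:
    """commits: one row per commit, with columns engineer_id, company,
    date (datetime64), is_full_time. Returns flagged engineer_ids."""
    ft = commits[commits["is_full_time"]].copy()   # exclude contractors/part-timers
    ft["month"] = ft["date"].dt.to_period("M")
    # distinct companies each engineer committed to, per month
    per_month = ft.groupby(["engineer_id", "month"])["company"].nunique()
    # count the months in which an engineer committed to 2+ companies
    multi_months = (per_month >= 2).groupby("engineer_id").sum()
    # require several such months, to exclude people simply switching jobs
    return multi_months[multi_months >= min_months].index.to_series()
```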
🧵How many software engineers secretly work 2+ jobs? My research group at Stanford has access to private code repos covering 100K+ engineers at ~1,000 companies: ~0.5% of the world’s developers. Our data shows that 4.1% of software engineers are working 2+ jobs.
4 replies · 24 reposts · 82 likes
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
12 replies · 57 reposts · 434 likes
Here is a related post, where we found that almost 10% of all software engineers are "Ghost Engineers" - doing no work. https://t.co/xELmJPZkfb
I’m at Stanford and I research software engineering productivity. We have data on the performance of >50k engineers from 100s of companies. Inspired by @deedydas, our research shows: ~9.5% of software engineers do virtually nothing: Ghost Engineers (0.1x-ers)
1 reply · 2 reposts · 33 likes
My research group at Stanford has access to private code repos from 100K+ engineers at almost 1,000 companies, ~0.5% of the world’s developers. Within this "small" sample, we routinely find engineers working 2+ jobs. I estimate that easily >5% of all engineers are working 2+…
PSA: there’s a guy named Soham Parekh (in India) who works at 3-4 startups at the same time. He’s been preying on YC companies and more. Beware. I fired this guy in his first week and told him to stop lying / scamming people. He hasn’t stopped a year later. No more excuses.
22 replies · 22 reposts · 377 likes
Third #ICML2025 paper! What effect will web-scale synthetic data have on future deep generative models? Collapse or Thrive? Perils and Promises of Synthetic Data in a Self-Generating World 🔄 @JoshuaK92829 @ApratimDey2 @MGerstgrasser @rm_rafailov @sanmikoyejo 1/7
4 replies · 24 reposts · 113 likes
How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning…
11 replies · 65 reposts · 224 likes
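For intuition only, a generic weak-verifier ensemble looks roughly like this; it is a minimal sketch of the general idea, not Weaver's actual algorithm, and the verifier weights are assumed to come from some training step:

```python
from typing import Callable, Sequence

Verifier = Callable[[str, str], float]  # (question, answer) -> score in [0, 1]

def select_answer(question: str,
                  candidates: Sequence[str],
                  verifiers: Sequence[Verifier],
                  weights: Sequence[float]) -> str:
    """Pick the candidate with the best weighted combination of weak
    verifier scores (weights might be fit on a small labeled set)."""
    def combined(answer: str) -> float:
        return sum(w * v(question, answer) for v, w in zip(verifiers, weights))
    return max(candidates, key=combined)
```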
🚨New preprint 🚨 Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models We examine min-p sampling (ICLR 2025 oral) & find significant problems in all 4 lines of evidence: human eval, NLP evals, LLM-as-judge evals, community adoption claims 1/8
12 replies · 40 reposts · 288 likes
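For context on what is being analyzed: min-p sampling keeps only tokens whose probability is at least min_p times the top token's probability, then renormalizes and samples. A minimal NumPy sketch of the sampler itself (not of the paper's critique):

```python
import numpy as np

def min_p_sample(logits: np.ndarray, min_p: float = 0.1,
                 rng: np.random.Generator | None = None) -> int:
    """Sample a token index using min-p truncation."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    threshold = min_p * probs.max()     # cutoff scales with the top probability
    probs = np.where(probs >= threshold, probs, 0.0)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```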
Excited to be speaking in the AI Architects track of @aiDotEngineer in June!
Announcing our speakers for the AI Architects track -- 1 of 2 tracks as part of our exclusive Leadership Track! ⚠️PSA: Tix nearly sold out, get em here: https://t.co/1CtDfZcJQl An incredible lineup of speakers at this track, across two days, featuring: @claybavor, Co-Founder…
2 replies · 0 reposts · 5 likes