Prateek Yadav
@prateeky2806
Followers: 4K · Following: 2K · Media: 60 · Statuses: 1K
pre-training @AIatMeta; prev: part-time @GoogleDeepMind, PhD at @unccs
Sunnyvale, CA
Joined July 2014
Leaving Meta and PyTorch
I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: I didn't want to be doing PyTorch forever, and it seemed like the perfect time to transition, right after I got back from a long leave and the project built itself around me. Eleven years
501
580
11K
Hire him; you won't face any skill issues going forward. I did my PhD alongside Yi Lin and have learned a lot from him. A few months ago he was bombarded with good offers, but now he is on the market again.
Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.
1
3
54
🚨 🤯 Wow! Yi Lin is an amazing researcher, who works on very hard and important problems in LLM and VLM training, RL, PEFT, Quantization, etc. -- ironically, he had several other top offers just a few months ago! Hire him ASAP if you want to pick up a top talent (and several
Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.
5
29
159
We're hiring AI researchers, engineers, growth, and interns at @PrimeIntellect. Join us to build open superintelligence and make the stack accessible to everyone.
• Member of Technical Staff - Agents
• Member of Technical Staff - Full Stack
• Member of Technical Staff - GPU
113
112
1K
The Meta layoffs include incredible talent. Having built a startup myself, I know the same people can perform very differently depending on the environment. (My “raise $2B and start a frontier lab” tweet wasn’t entirely a joke.) If you were impacted, my DMs are open. Let's chat!
Meta laid off 600 people from its Superintelligence Lab today. Many FAIR researchers, including FAIR Research Scientist Director Yuandong Tian, were affected. I think Yann LeCun will leave soon. Maybe I should raise $2B and start a new frontier lab with these folks.
21
20
296
Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.
Meta has gone crazy with the squid game! Many new PhD NGs (new grads) were deactivated today (I am also impacted 🥲, happy to chat)
43
62
510
We're hiring Research Scientists / Engineers!
- We work closely with all frontier labs
- We're a small org and can move fast
- We can choose our own agenda and what we publish
We're especially looking for people who enjoy fast empirical research. Deadline: 31 Oct!
16
69
723
I’m so sorry to all the folks who got laid off from FAIR yesterday. This can’t be easy. If you’re ready to move on and want to work on a small, extraordinary team building the future of AI evaluations, please message me here. It is a highly technical and important challenge
3
5
61
Several of my team members + myself are impacted by this layoff today. Welcome to connect :)
474
282
7K
While at Meta, I worked on this optimizer wrapper (outer-step lookahead momentum) we're calling Snoo ( https://t.co/SSZLcYNXzp). You can use it with AdamW or Muon and see really strong scaling. Here's a plot where we ran it against (tuned) AdamW up to 1e23 training-FLOP scales.
5
21
232
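The tweet describes Snoo only as an "outer-step lookahead momentum" wrapper around an inner optimizer such as AdamW or Muon. Below is a minimal toy sketch of that general idea (Lookahead-style slow weights plus a momentum buffer over the net inner displacement) on a 2-D quadratic; all names and hyperparameters here are made up for illustration, and Snoo's actual implementation in the linked repo may differ substantially.

```python
import numpy as np

def outer_momentum_train(outer_steps=20, k_inner=10, inner_lr=0.1,
                         outer_lr=1.0, outer_mu=0.5):
    """Toy outer-step lookahead-momentum loop (hypothetical names).

    Inner loop: k plain-SGD steps on the "fast" weights.
    Outer loop: momentum over the net inner displacement, applied
    to the "slow" weights.
    """
    target = np.array([3.0, -2.0])       # minimum of 0.5*||w - target||^2
    grad = lambda w: w - target          # its gradient

    slow = np.zeros(2)                   # outer ("slow") weights
    momentum = np.zeros(2)               # outer momentum buffer
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k_inner):         # inner optimizer (plain SGD here;
            fast -= inner_lr * grad(fast)  # Snoo pairs with AdamW/Muon)
        delta = fast - slow              # net progress made by inner loop
        momentum = outer_mu * momentum + delta   # outer momentum update
        slow = slow + outer_lr * momentum        # outer lookahead step
    return slow

w = outer_momentum_train()
final_loss = 0.5 * np.sum((w - np.array([3.0, -2.0])) ** 2)
```

With `outer_mu=0`, `outer_lr=1` this reduces to running the inner optimizer directly; the momentum term lets the outer step extrapolate along the averaged inner trajectory, which is the intuition behind the scaling claim.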
Meet HoneyBee: Our new 2.5M sample multi-modal reasoning dataset. It outperforms InternVL2.5/3-Instruct and Qwen2.5-VL-Instruct. More details in this post!
New paper 📢 Most powerful vision-language (VL) reasoning datasets remain proprietary 🔒, hindering efforts to study their principles and develop similarly effective datasets in the open 🔓. Thus, we introduce HoneyBee, a 2.5M-example dataset created through careful data
0
2
11
I am pleased to announce our new paper, which provides an extremely sample-efficient way to create an agent that can perform well in multi-agent, partially-observed, symbolic environments. The key idea is to use LLM-powered code synthesis to learn a code world model (in the form
17
105
826
Congrats to @pingzli for pushing this through and getting this accepted.
🥳🥳 Excited to share that our work GLIDER (Global and Local Instruction-Driven Expert Router) has been accepted to #EMNLP2025 main conference! Our approach tackles a critical challenge in MoE routing: existing methods excel at either held-in OR held-out tasks, but rarely both.
0
0
3
The TPUs are ready, come make videos on @GeminiApp!
3 free video generations. 1 weekend only. https://t.co/x0H7mWra9m Ends Sunday 10pm PT.
29
31
508
Excited to share that our paper on efficient model development has been accepted to #EMNLP2025 Main conference @emnlpmeeting. Congratulations to my students @linusdd44804 and @Sub_RBala on their first PhD paper! 🎉
🚨 New paper 🚨 Excited to share my first paper w/ my PhD students!! We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model
0
10
51
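The quoted paper says that capabilities conferred by instruction/alignment tuning can be encoded as model diff vectors (à la task vectors) and transferred across models. The numpy sketch below illustrates only the generic diff-vector arithmetic on toy flat weight vectors; the variable names are invented, and the paper's actual transfer recipe (which models, which layers, any scaling) is surely more involved.

```python
import numpy as np

# Toy "models" as flat weight vectors. A diff vector is simply
# (tuned weights - base weights); adding it to another compatible
# base grafts the tuning direction onto that model.
rng = np.random.default_rng(0)
base_a = rng.normal(size=8)            # hypothetical base model A
tuned_a = base_a + 0.1 * np.ones(8)    # A after instruction tuning
base_b = rng.normal(size=8)            # a different base model B

diff = tuned_a - base_a                # the "model diff vector"
transferred_b = base_b + diff          # apply the diff to B

# B moved by exactly the tuning delta, though from a different start:
shift = transferred_b - base_b
```

In practice the two bases must share an architecture (and ideally a common ancestor) for the addition to be meaningful, which is the setting task-vector methods assume.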
Jaemin is amazing and I would highly recommend applying for a PhD with him.
🥳 Gap year update: I'll be joining @allen_ai/@UW for 1 year (Sep2025-Jul2026 -> @JHUCompSci) & looking forward to working with amazing folks there, incl. @RanjayKrishna, @HannaHajishirzi, Ali Farhadi. 🚨 I’ll also be recruiting PhD students for my group at @JHUCompSci for Fall
1
3
10
@GoogleDeepMind India 🇮🇳 & Japan 🇯🇵 are looking for strong candidates in multilinguality, multicultural, & multimodality areas.
RS Bangalore: https://t.co/6df1aak2qK
RS Tokyo: https://t.co/VCHoCZ1Ehb
RE Tokyo:
2
24
155