Kaiyu Yang
@KaiyuYang4
Followers: 5K
Following: 1K
Media: 30
Statuses: 284
Research Scientist at @Meta Fundamental AI Research (FAIR), New York. Previously: Postdoc @Caltech, PhD @PrincetonCS, Undergrad @Tsinghua_Uni.
New York, NY
Joined June 2019
@daemonzhang6 That's the problem. People who are responsible for the issues are not the people who got laid off 😅 In January, our team put down all the research we were doing at the time and was (forced?) to move to GenAI <2 months before the Llama 4 release deadline to help with all the
9
39
827
Several of my team members and I are impacted by today's layoff. Feel free to connect :)
474
287
7K
✂️Introducing ProofOptimizer: a training and inference recipe for proof shortening! 😰AI-written formal proofs can be long and unreadable: Seed-Prover's proof of IMO '25 P1 is 16x longer in Lean vs. English. Our 7B shortens proofs generated by SoTA models by over 50%! 🧵⬇️
6
35
205
LLMs solving math benchmarks with verifiable answers like AIME? ✅ LLMs solving math proofs? ❌ Still an open problem. RL works great for final-answer problems, but proofs are different:
- Often no single checkable answer
- Correct answers can hide flawed reasoning
The key
9
37
187
⏰ Only 2 days left to submit! The 5th MATH-AI Workshop @NeurIPSConf 2025 is calling for papers 📝 👉 https://t.co/qh1OEW0uXO 🧮 Topics: benchmarks, algorithms, models, theorem proving, education, applications & more. 📅 Deadline: Sept 26, 2025 (AoE) — 4 pages, non-archival
1
5
33
Introducing DeepConf: Deep Think with Confidence 🚀 First method to achieve 99.9% on AIME 2025 with open-source models! Using GPT-OSS-120B even without tools, we reached this almost-perfect accuracy while saving up to 85% of generated tokens. It also delivers many strong
63
333
2K
The Goedel-Prover-V2 report is now on arXiv: https://t.co/yROjbJMVgP Check out the details on self-correction, the large-scale scaffolded data synthesis framework, and the magical model averaging.
9
114
316
⏱️AI is making the verification process easier, with models verifying proofs in minutes. 💻 Now, @prfsanjeevarora, @chijinML, @danqi_chen and @PrincetonPLI have released Goedel Prover V2, a model more efficient and more accurate than any previous model. 👉 https://t.co/v7500VNytz
1
22
94
Thanks to our dedicated teams of organizers: @HanSineng, @lupantech, @weixiong_1, @ericzelikman, @Yong18850571, @uniq_zz, Soonho Kong, @hhexiy, @dawnsongtweets, @prfsanjeevarora.
0
0
14
🤝 Seeking sponsors to support travel grants & recognize outstanding work at MATH‑AI! Interested in sponsoring? DM me for details.
0
0
8
🔍 We need reviewers to help maintain our scientific quality! If you're interested in reviewing MATH‑AI submissions, please sign up here: https://t.co/w8cmheI8yS. Reviewers play a vital role—thanks for your contributions!
docs.google.com
The Workshop on Mathematical Reasoning and AI (MATH-AI) at NeurIPS 2025 aims to bring together diverse participants from a wide range of backgrounds, institutions, and disciplines to explore a...
0
0
10
✍️ Submit your 4‑page, non‑archival workshop papers to MATH‑AI. 📅 Deadline (tentative): Aug 29, 2025 AoE 📌 Info & CFP: https://t.co/58JkYyG3TE 🔗 Submission via OpenReview: https://t.co/uexGr2UK2W All accepted papers will be presented as posters, with a few selected orals and
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop MATH-AI
0
1
12
⭐️ We have a stellar lineup of speakers:
* Swarat Chaudhuri (UT Austin & Google DeepMind) @swarat
* Weizhu Chen (Microsoft) @WeizhuChen
* Yejin Choi (Stanford & NVIDIA) @YejinChoinka
* Hannaneh Hajishirzi (UW & AI2)
* Heng Ji (UIUC) @hengjinlp
* Chi Jin (Princeton) @chijinML
*
0
3
23
🚀 Excited to share that the Workshop on Mathematical Reasoning and AI (MATH‑AI) will be at NeurIPS 2025! 📅 Dec 6 or 7 (TBD), 2025 🌴 San Diego, California
8
57
238
Lovely to see the impressive performance of the Seed Prover developed by the ByteDance Seed team at IMO 2025: it achieved a silver-level score (30 out of 42) within three days and reached 35 out of 42 with extended compute time.
leanprover.zulipchat.com
2
25
75
Another AI system, ByteDance's SeedProver, solved 4 out of 6 IMO problems *with* Lean, and solved a fifth with extended compute. This is becoming routine, like when we went to the moon for the fourth time. There is *nothing* "routine" about this!!...
9
52
473
While IMO is trending, our model leads on college-level math (Putnam Benchmark)—nearly doubling the problems solved by prior SOTA, with formal, verifiable proofs! Moreover, it’s not just an announcement—you can actually download and use our model. 🙂
🔥Our Goedel-Prover-V2-32B topped the PutnamBench Leaderboard by solving 86 problems, nearly 2× as many as the previous SOTA DeepSeek-Prover-V2-671B (which solved 47), while using:
* 1/20 the model size (32B vs. 671B)
* 1/5 the passes (184 vs. 1024)
Meanwhile, we also release *
4
23
169
Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to @lmthang and the team!
deepmind.google
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world’s most prestigious competition for young...
203
761
6K