Kaiyu Yang

@KaiyuYang4

Followers
4K
Following
1K
Media
30
Statuses
272

Research Scientist at @Meta Fundamental AI Research (FAIR), New York. Previously: Postdoc @Caltech, PhD @PrincetonCS, Undergrad @Tsinghua_Uni.

New York, NY
Joined June 2019
@KaiyuYang4
Kaiyu Yang
17 hours
🚀 Excited to share that the Workshop on Mathematical Reasoning and AI (MATH‑AI) will be at NeurIPS 2025!
📅 Dec 6 or 7 (TBD), 2025.
🌴 San Diego, California
7
27
171
@KaiyuYang4
Kaiyu Yang
17 hours
Thanks to our dedicated teams of organizers: @HanSineng, @lupantech, @weixiong_1, @ericzelikman, @Yong18850571, @uniq_zz, Soonho Kong, @hhexiy, @dawnsongtweets, @prfsanjeevarora.
0
0
11
@KaiyuYang4
Kaiyu Yang
17 hours
🤝 Seeking sponsors to support travel grants & recognize outstanding work at MATH‑AI! Interested in sponsoring? DM me for details.
0
0
7
@KaiyuYang4
Kaiyu Yang
17 hours
🔍 We need reviewers to help maintain our scientific quality! If you're interested in reviewing MATH‑AI submissions, please sign up here: Reviewers play a vital role—thanks for your contributions!
docs.google.com
The Workshop on Mathematical Reasoning and AI (MATH-AI) at NeurIPS 2025 aims to bring together diverse participants from a wide range of backgrounds, institutions, and disciplines to explore a...
0
0
9
@KaiyuYang4
Kaiyu Yang
17 hours
✍️ Submit your 4‑page, non‑archival workshop papers to MATH‑AI.
📅 Deadline (tentative): Aug 29, 2025 AoE.
📌 Info & CFP:
🔗 Submission via OpenReview:
All accepted papers will be presented as posters, with a few selected orals and…
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop MATH-AI
0
1
10
@KaiyuYang4
Kaiyu Yang
17 hours
⭐️ We have a stellar lineup of speakers:
* Swarat Chaudhuri (UT Austin & Google DeepMind) @swarat
* Weizhu Chen (Microsoft) @WeizhuChen
* Yejin Choi (Stanford & NVIDIA) @YejinChoinka
* Hannaneh Hajishirzi (UW & AI2)
* Heng Ji (UIUC) @hengjinlp
* Chi Jin (Princeton) @chijinML
* …
0
1
16
@KaiyuYang4
Kaiyu Yang
1 day
RT @WendaLi8: Lovely to see the impressive performance of the Seed Prover developed by the ByteDance Seed team at IMO 2025 — achieving a si…
leanprover.zulipchat.com
Browse the publicly accessible channels in Lean without logging in.
0
22
0
@KaiyuYang4
Kaiyu Yang
1 day
RT @AlexKontorovich: Another AI system, ByteDance's SeedProver solved 4 out of 6 IMO problems *with* Lean, and solved a fifth with extended…
0
49
0
@KaiyuYang4
Kaiyu Yang
2 days
RT @chijinML: While IMO is trending, our model leads on college-level math (Putnam Benchmark)—nearly doubling the problems solved by prior…
0
21
0
@KaiyuYang4
Kaiyu Yang
2 days
RT @demishassabis: Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced ver…
deepmind.google
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world’s most prestigious competition for young...
0
765
0
@KaiyuYang4
Kaiyu Yang
5 days
RT @alexwei_: 1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI…
0
1K
0
@KaiyuYang4
Kaiyu Yang
7 days
RT @_akhaliq: Goedel-Prover-V2. The Strongest Open-Source Theorem Prover to Date
0
33
0
@KaiyuYang4
Kaiyu Yang
8 days
RT @Dorialexander: SOTA on PutnamBench with a 32b model (and highly competitive 8b): Goedel team is not messing around. Unsurprisingly mos…
0
5
0
@KaiyuYang4
Kaiyu Yang
9 days
RT @prfsanjeevarora: Formal math taking off at @PrincetonPLI ! New Goedel-Prover v2 8B model matches 2.5 month old Deepseek V2 prover 671B…
0
16
0
@KaiyuYang4
Kaiyu Yang
9 days
RT @chijinML: 🚀 Huge milestone from our Goedel-Prover team: we’ve just released a new state-of-the-art model (8B & 32B) for automated theor…
0
10
0
@KaiyuYang4
Kaiyu Yang
9 days
Our Goedel-Prover-V2 doubled the SOTA Pass@32 performance on PutnamBench with a 20x smaller model, making it the strongest open-source theorem prover to date!
@Yong18850571
Yong Lin
9 days
(1/4) 🚨 Introducing Goedel-Prover V2 🚨
🔥🔥🔥 The strongest open-source theorem prover to date.
🥇 #1 on PutnamBench: Solves 64 problems—with far less compute.
🧠 New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%.
* 8B > 671B: Our 8B…
0
14
88
@KaiyuYang4
Kaiyu Yang
14 days
RT @noahdgoodman: So proud! Go work with Gabriel, he’ll be the best advisor.
0
5
0
@KaiyuYang4
Kaiyu Yang
22 days
RT @swarat: Passionate about frontier AI models, classical symbolic reasoning, and safe/secure software? Consider applying for this positio…
job-boards.greenhouse.io
0
15
0
@KaiyuYang4
Kaiyu Yang
1 month
RT @dawnsongtweets: 1/ 🔥 AI agents are reaching a breakthrough moment in cybersecurity. In our latest work: 🔓 CyberGym: AI agents discov…
0
141
0
@KaiyuYang4
Kaiyu Yang
1 month
With LLMs increasingly used in software development, the bottleneck will move from writing code to reasoning about code (review, testing, debugging, and verification). Dynamically typed languages like Python are popular because they made code easy to write. However, the future…
@0xlf_
Zhe Ye
1 month
1/🧵Introducing VERINA: a high-quality benchmark for verifiable code generation. As LLMs are increasingly used to generate software, we need more than just working code--We need formal guarantees of correctness. VERINA offers a rigorous and modular framework for evaluating LLMs.
1
3
34