Jiri Gesi ✈️NeurIPS ✈️

@JIRIGESI

Followers
175
Following
656
Media
10
Statuses
144

Post training @amazon, previous @UCIrvine

Joined September 2015
@dakuowang
Dakuo Wang
6 days
Our team will be giving a demo on “LLM Agent as Digital Twins of Online Shopping Customers” at #NeurIPS2025. This is a collaboration between @amazon and @Northeastern human-centered AI lab. We are actively hiring PhD students, postdocs, and interns. Stop by the Amazon booth tomorrow
2
4
16
@JIRIGESI
Jiri Gesi ✈️NeurIPS ✈️
12 days
I’ll be at NeurIPS. If you’re interested in a 2026 PhD research internship with Amazon Store Foundation AI and want to work on agents, RL, and multi-modal, I’d love to connect at the conference.
3
1
17
@karpathy
Andrej Karpathy
2 months
Hah, judging by mentions overnight, people seem to find the ghost analogy provocative. I swear I don't wake up just trying to come up with new memes, but to elaborate briefly why I thought it was a fun comparison: 1) It captures the idea that LLMs are purely digital artifacts that
88
79
1K
@Lyubh22
Bohan Lyu
2 months
Building upon Goedel-Prover-V2, Hilbert Prover achieved 99.2% on MiniF2F and solved over 70% of PutnamBench problems😱 Amazing news from my old home @yuqirose's lab. At ICML this year, someone asked why the model struggled with Putnam problems. I said it was a matter of time, and
0
4
17
@JIRIGESI
Jiri Gesi ✈️NeurIPS ✈️
2 months
🪢 Careful SFT + 🧩 token-adaptive weighting helps avoid catastrophic forgetting 🧠
@jclin808
Jiacheng Lin
2 months
📉SFT might not suffer as much catastrophic forgetting as you think. Lately, much debate around GRPO in the community. RL is hot—but let’s not forget, in the context of LLMs: SFT is the bedrock of almost all RL. Also, there’s still a lot we don’t fully understand about SFT.
0
0
0
@Yong18850571
Yong Lin
4 months
The report of Goedel-Prover-V2 is on arXiv now: https://t.co/yROjbJMVgP. Check out the details on self-correction, the large-scale scaffolded data synthesis framework, and the magical model averaging.
9
107
308
@chijinML
Chi Jin
4 months
Many friends still ask me about AI for IMO, formal vs informal math. Some quick thoughts: IMO results: GDM and OpenAI achieved gold using informal (natural language) methods. ByteDance and AlphaProof (last year) got gold/silver using formal methods (Lean + specialized geometry
12
40
372
@PrincetonCS
Princeton Computer Science
5 months
⏱️AI is making the verification process easier, with models verifying proofs in minutes. 💻 Now, @prfsanjeevarora, @chijinML, @danqi_chen and @PrincetonPLI have released Goedel Prover V2, a model more efficient and more accurate than any previous model. 👉 https://t.co/v7500VNytz
1
21
96
@webagentlab
WebAgentlab
5 months
Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning The paper introduces Shop-R1, a reinforcement learning framework that significantly enhances the simulation of realistic online shopping behaviors using Large Language Models by
1
1
2
@chijinML
Chi Jin
5 months
While IMO is trending, our model leads on college-level math (Putnam Benchmark)—nearly doubling the problems solved by prior SOTA, with formal, verifiable proofs! Moreover, it’s not just an announcement—you can actually download and use our model. 🙂
@Yong18850571
Yong Lin
5 months
🔥Our Goedel-Prover-V2-32B topped the PutnamBench Leaderboard by solving 86 problems —nearly 2× more than the previous SOTA DeepSeek-Prover-V2-671B (solved 47), while using: * 1/20 the model size (32B vs. 671B) * 1/5 the passes (184 vs. 1024) Meanwhile, we also release *
4
23
168
@chijinML
Chi Jin
5 months
Congrats! As a scientist/mathematician trained to verify things rigorously, I'm curious—will we get to see a bit more than tweets and final outputs (e.g., how they were generated/selected) to verify the claims? 🙂
@alexwei_
Alexander Wei
5 months
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
4
2
106
@chijinML
Chi Jin
5 months
I will also give a talk about theorem proving and Goedel-prover V2 at 12:45 today at @ai4mathworkshop . Drop by our talk and poster if you are at ICML!
@Lyubh22
Bohan Lyu
5 months
Goedel Prover V2 ( https://t.co/Xewuj90yGf) will be featured at @ai4mathworkshop today. Come and discuss with us!
0
8
30
@JIRIGESI
Jiri Gesi ✈️NeurIPS ✈️
5 months
Shout-out to the best theorem prover model to date!
@Yong18850571
Yong Lin
5 months
(1/4)🚨 Introducing Goedel-Prover V2 🚨 🔥🔥🔥 The strongest open-source theorem prover to date. 🥇 #1 on PutnamBench: Solves 64 problems—with far less compute. 🧠 New SOTA on MiniF2F: * 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%. * 8B > 671B: Our 8B
0
0
0
@deepseek_ai
DeepSeek
11 months
🚀 DeepSeek-R1 is here! ⚡ Performance on par with OpenAI-o1 📖 Fully open-source model & technical report 🏆 MIT licensed: Distill & commercialize freely! 🌐 Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today! 🐋 1/n
2K
7K
36K
@Alibaba_Qwen
Qwen
1 year
Qwen2.5 Technical Report https://t.co/09b9WvA9pY
13
236
1K
@COLM_conf
Conference on Language Modeling
1 year
Announcement #1: our call for papers is up! 🎉 https://t.co/o8Mv1ywQwZ And excited to announce the COLM 2025 program chairs @yoavartzi @eunsolc @RanjayKrishna and @AdtRaghunathan
1
43
163
@natolambert
Nathan Lambert
1 year
First slide deck for NeurIPS is done -- a short overview of how I view post-training for applications. A higher level summary on the key decisions along the way of scoping a problem, choosing a base model, optimization algorithm, etc. (Plus some thoughts on OpenAI's RL
4
31
285
@gneubig
Graham Neubig
1 year
We are now done with all classes for CMU CS11-711 Advanced NLP! Slides: https://t.co/zY0CRx4NVw Videos: https://t.co/FZt0FLv6v4 Hope this is useful to people 😀
youtube.com
Videos for Carnegie Mellon University's CS 11-711 Advanced NLP by Graham Neubig. Class Site: https://phontron.com/class/anlp-fall2024/
@gneubig
Graham Neubig
1 year
We started the Fall 2024 version of CMU CS11-711 Advanced NLP🎓 Follow along to learn about the latest in NLP, LLMs, Agents, etc. * Materials: https://t.co/LETEcVsBJl * Videos:
6
91
480
@stanfordnlp
Stanford NLP Group
1 year
Great article about our newest @stanfordnlp faculty member @Diyi_Yang in @Stanford Report: “I am passionate about developing a future where humans and AIs can collaborate to achieve greater collective intelligence in a variety of contexts, education, healthcare, & the workplace”
4
35
222
@ShunyuYao12
Shunyu Yao
1 year
Had a fun time delivering a language agent tutorial (https://t.co/UlDj9S4BfC) with @ysu_nlp @Diyi_Yang @taoyds @emnlpmeeting! Thanks for joining and asking good qs!
5
16
219