
Hao Peng
@haopeng_nlp
604 Followers · 39 Following · 0 Media · 38 Statuses
RT @charlesfornlp: So many works talking about entropy, but what is the **mechanism** of entropy in RL for LLMs? Our work gives a princi…
RT @Shivamag12: Can entropy minimization alone improve LLM performance? And how far can they go without any labeled data? This work answers…
RT @saagnikkk: Paper Alert: "RL Finetunes Small Subnetworks in Large Language Models". From DeepSeek V3 Base to DeepSeek R1 Zero, a whopp…
RT @zhaofeng_wu: We find that models "think" in English (or in general, their dominant language) when processing distinct non-English or…
RT @AkariAsai: I'm on the job market this year! I'm completing my @uwcse Ph.D. (2025), where I identify and tackle key LLM limitations…
RT @OfirPress: I'm on the academic job market! I develop autonomous systems for: programming, research-level question answering, finding s…
RT @lifan__yuan: Wanna train PRMs but process labels, annotated manually or automatically, sound too expensive to you? Introduce Implicit…
RT @bingyikang: Curious whether video generation models (like #SORA) qualify as world models? We conduct a systematic study to answer this…
RT @MKhalifaaaa: What if LLMs can cite the pre-training source(s) supporting their parametric knowledge? Won't this dramatically improve ve…
arxiv.org
Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source...
RT @YangyiChen6666: Introducing SOLO, a single Transformer architecture for unified vision-language modeling. SOLO accepts both raw image…
Language models excel at undergraduate exams, but how do they fare in research? SciCode challenges models with real research coding problems. Even the best models solve less than 5%. Very proud of @MinyangTian1 and @luyu_gao for leading the charge!
SciCode is our new benchmark that challenges LMs to code solutions for scientific problems from advanced papers. The challenges were crafted by PhDs; ~10% of our benchmark is based on Nobel-winning research. GPT-4 and Sonnet 3.5 get <5% ACC. 1/6
RT @YueGuo10: I'm joining UIUC @UofIllinois this fall as an Assistant Professor in the iSchool, with an affiliation in Computer Science…
RT @Francis_YAO_: From Claude100K to Gemini10M, we are in the era of long-context language models. Why and how a language model can utilize…
RT @zhaofeng_wu: Want to train an aligned LM in a new language but don't have preference data for training the reward model (RM)? Just…
RT @jyangballin: SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-benc…
Very proud of Eurus. A huge shoutout to @lifan__yuan and @charlesfornlp for leading this!
Introducing Eurus, a suite of state-of-the-art LLM reasoning generalists powered by a new member of the Ultra-Series, UltraInteract! Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through comprehensive benchmarking across 12 tests (mostly OOD) covering five tasks!
Very proud of Eurus. A huge shoutout to @lifan__yuan and @charlesfornlp for leading this!
This is joint work with @charlesfornlp, @wanghanbin95, @stingning, @xingyaow_, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, and advisors Bowen Zhou, @haopeng_nlp, @zibuyu9, Maosong Sun. cc @TsinghuaNLP @uiuc_nlp
RT @Francis_YAO_: Frontier models all have at least 100k context length; Gemini 1.5 even has 1M context. What about research and open sourc…
RT @xingyaow_: Large Language Model (LLM) agents promise to free us from mundane tasks, but how should they best interact with our world? I…
RT @xingyaow_: This is joint work with @YangyiChen6666, @lifan__yuan, @YizheZhangNLP, @YunzhuLiYZ, @haopeng_nlp, and @elgreco_winter…
arxiv.org
Large Language Model (LLM) agents, capable of performing a broad range of actions, such as invoking tools and controlling robots, show great potential in tackling real-world challenges. LLM agents...