
Pang Wei Koh
@PangWeiKoh
Followers: 4K · Following: 2K · Media: 12 · Statuses: 344
Assistant professor at @uwcse and visiting research scientist at @allen_ai. Formerly @StanfordAILab @GoogleAI @Coursera. 🇸🇬
Joined June 2020
RT @allen_ai: New updates for olmOCR, our fully open toolkit for transforming documents (PDFs & images) into clean markdown. We released:…
RT @lm4sci: 🚨 Call for Papers: LM4Sci @COLM_conf 2025 🚨 Excited to announce the Large Language Modeling for Scientific Discovery (LM4Sci)…
RT @RulinShao: Our Spurious Rewards is available on ArXiv! We added experiments on: - More prompts/steps/models/analysis - Spurious Prom…
LMs would be much more reliable if we could control the information used in their responses -- a list of approved drugs, facts about a safety recall, etc. -- and not add anything extraneous. Check out @jcqln_h's work on Precise Information Control below!
LMs often output answers that sound right but aren't supported by the input context. This is intrinsic hallucination: the generation of plausible but unsupported content. We propose Precise Information Control (PIC): a task requiring LMs to ground only on given verifiable claims.
RT @jcqln_h: LMs often output answers that sound right but aren't supported by input context. This is intrinsic hallucination: the generati…
RT @niloofar_mire: 📣 Thrilled to announce I'll join Carnegie Mellon University (@CMU_EPP & @LTIatCMU) as an Assistant Professor starting Fal…
RT @yizhongwyz: Thrilled to announce that I will be joining @UTAustin @UTCompSci as an assistant professor in fall 2026! I will continue…
RT @AkariAsai: "Bold," "positive" and "unparalleled": Allen School Ph.D. graduates Ashish Sharma and Sewon Min recognized with ACM Doctoral…
RT @zzlccc: We do appreciate their efforts in writing the criticisms, but "turns out that the results in this paper are misreported" is a s…
RT @RulinShao: One more fun thing! RLVR can elicit existing behaviors like code reasoning. But! If your model is not good at code but thou…
RT @StellaLisy: 🤯 We cracked RLVR with… Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by: - Rando…
Turns out that RL on "verifiable rewards" can work really well even when these rewards are completely random -- but even then, only on some model families! There's still much to understand about RLVR. Check out our analysis on spurious rewards below:
🤯 We cracked RLVR with… Random Rewards?! Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵 Blogpost:
RT @rui_xin31: Think PII scrubbing ensures privacy? 🤔 Think again‼️ In our paper, for the first time on unstructured text, we show that you…
RT @RulinShao: Super excited to see how ReasonIR data can also help much much smaller models to achieve high reasoning-intensive retrieval…
RT @SitingLi627: Excited to share that our paper "Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder" is a…
RT @percyliang: What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire…
RT @tomchen0: LLMs naturally memorize some verbatim of pre-training data. We study whether post-training can be an effective way to mitigat…
RT @thao_nguyen26: 📢 Announcing our data-centric workshop at ICML 2025 on unifying data curation frameworks across domains! 📅 Deadline: Ma…
RT @Muennighoff: Reasoning & test-time scaling don't just matter for generating text with LLMs – @RulinShao, @ray_qiaorui & team show how t…
RT @RulinShao: Meet ReasonIR-8B ✨ the first retriever specifically trained for reasoning tasks! Our challenging synthetic training data unloc…