Joy Wongkamjan
@joywwong
306 Followers · 3K Following · 70 Media · 14K Statuses
PhD student @ UMD, super interested in human-AI interaction in any language tasks!
Los Angeles, California
Joined October 2012
Our paper CTRL-D is accepted to ACL Findings and will be presented at ACL 2025! Poster session: 18:00–19:30 (Level 0 Exhibit Halls X4/X5). I'm sad I can't be there, but Jordan (@boydgraber) will be! You'll enjoy learning about CTRL-D from him. Now… what is CTRL-D?
My last paper as a postdoc is out! Practical Alignment: Why Learning from Human Feedback Alone Is Not Enough for Safe and Helpful AI. This paper started from a simple observation: humans have incorrect beliefs about the world, and today's alignment methods pretend they
The toughest moment in a PhD:
>spend a year building smth you're proud of
>travel across the world with advisors' support to share it
>when your moment finally comes
>your mic gets cut because the previous session ran late
Heartbroken, but thanks to those who stayed for me @NeurIPSConf
@breadli428 doing a great job finishing his presentation after everyone got kicked out of the room at the Embodied World Models Workshop @NeurIPSConf three minutes into his talk... #NeurIPS2025
Paper primarily from @Princeton and @UofIllinois!
Banger paper from Stanford University on latent collaboration just dropped and it changes how we think about multi-agent intelligence forever. "Latent Collaboration in Multi-Agent Systems" shows that agents can coordinate without communication channels, predefined roles, or any
The MTI-LLM Workshop at @NeurIPSConf 2025 is coming next Saturday. We know agents need to work over long horizons. But we still lack a clear roadmap for agents that can work robustly over extremely long horizons. How do we scale RL training algorithms for long-term planning? How
Stoked this is finally out! Large-scale meta-learning allowed us to discover an algorithm that outperforms human-crafted ones on a range of RL benchmarks, including ones that we didn't train on. Check it out
Excited to announce that our work on "Discovering state-of-the-art RL algorithms" is finally published in @Nature! In this work, we meta-learned RL algorithms at scale. Paper: https://t.co/3V4TmPTWm4 Blog: https://t.co/G65ReK2iMs See thread
Attention: Google has dropped a new version of Attention Is All You Need!
Brutal honesty is what you want from a great research advisor
@TheVixhal your post challenged me. every one of your points is wrong but i had to think about each for a while :)
I'm recruiting my first group of PhD students at TTIC! If you're interested, please apply! If you know people who might be interested, please spread the word! Application deadline is Dec 9, 2025, and there is no application fee:
I'm really excited about our new paper!! 'Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs' Contrary to the belief that RL fine-tuning degrades memorized knowledge, RL-enhanced models consistently outperform base/SFT on knowledge recall by 24pp! RL teaches
On my way to #EMNLP2025! I'll be presenting our work (Oral) on Nov 5, Special Theme session, Room A106-107 at 14:30. Let's talk brains, machines, and everything in between :D Looking forward to all the amazing discussions!
Cool idea from Meta: what if we augment CoT + RL's token-space thinking with a "latent space"? This research proposes "The Free Transformer", a way to let LLMs make global decisions within a latent space (via a VAE encoder) that could later simplify autoregressive sampling
Lots of work on cross-lingual alignment encourages multilingual LLMs to generalize knowledge across languages. But this push for uniformity creates a tension: what happens to knowledge that should remain local? We look into this trade-off of transfer and cultural erasure:
So proud of Ruijie! He joined my lab as a sophomore and has grown with us ever since. I still remember his very first lab presentation: he wrote every equation by hand in real time. I knew right then he was different. He started with RL theory, explored robustness, then moved
Life update: I've successfully defended my PhD thesis today and will soon be joining the GEAR Lab as a research scientist to build humanoid robot foundation models. It's been such a wonderful journey at Maryland, already starting to miss it!
Today's AI agents are optimized to complete tasks in one shot. But real-world tasks are iterative, with evolving goals that need collaboration with users. We introduce collaborative effort scaling to evaluate how well agents work with people, not just complete tasks
Excited to introduce Supervised Reinforcement Learning, a framework that leverages expert trajectories to teach small LMs how to reason through hard problems without losing their minds. Better than SFT && RLVR. Read more: https://t.co/taEL8Vk4X5
#llms #RL #reasoning
@karpathy observed LLMs are "silently collapsed... only know 3 jokes". We prove this is mathematically inevitable due to RLHF + human psychology. But these capabilities aren't lost, just hidden, and easily restored. This means AI benchmarks are measuring training artifacts.
We scaled up an "alternative" paradigm in RL: *divide and conquer*. Compared to Q-learning (TD learning), divide and conquer can naturally scale to much longer horizons. Blog post: https://t.co/xtXBzya0bI Paper: https://t.co/nqYkLucsWu
AI can sketch what it thinks: no external tools, just pure latents! VLMs can see but can't imagine. What if they could sketch their thoughts directly? Meet Latent Sketchpad, the next leap for MLLMs, giving models an internal visual scratchpad to truly think in images.
New Nvidia paper shows how a single LLM can teach itself to reason better. It creates 3 roles from the same model: a Proposer, a Solver, and a Judge. The Proposer writes hard but solvable questions that stretch the model. The Solver answers those questions with clear steps and
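The three-role loop the tweet describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `generate` is a hypothetical stand-in for prompting the one shared model under different role instructions, stubbed here with canned strings.

```python
# Sketch of a single Proposer -> Solver -> Judge self-play round,
# where all three roles come from the same underlying model.
def generate(role: str, prompt: str) -> str:
    # Stub: a real system would call the shared LLM with a
    # role-specific system prompt; canned outputs keep this runnable.
    canned = {
        "proposer": "What is 17 * 24?",
        "solver": "17 * 24 = 408",
        "judge": "correct",
    }
    return canned[role]

def self_play_round():
    question = generate("proposer", "Write a hard but solvable question.")
    answer = generate("solver", question)
    verdict = generate("judge", f"Q: {question}\nA: {answer}")
    # The Judge's verdict becomes the reward signal used to train
    # the shared model on its own questions and answers.
    reward = 1.0 if verdict == "correct" else 0.0
    return question, answer, reward
```

The key design point is that one set of weights plays all three roles, so improvements from training on judged answers feed back into harder proposed questions.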