Seungju Han @SeungjuHan3 X Profile

Seungju Han

@SeungjuHan3

Followers

1K

Following

2K

Media

14

Statuses

184

language models & reasoning. cs phd student @stanfordailab

https://t.co/NGLZqlUryt

Stanford, CA

Joined December 2020

Don't wanna be here? Send us removal request.

Hadi Pouransari

@HPouransari

1 month

Introducing Pretraining with Hierarchical Memories: Separating Knowledge & Reasoning for On-Device LLM Deployment 💡We propose dividing LLM parameters into 1) anchor (always used, capturing commonsense) and 2) memory bank (selected per query, capturing world knowledge). [1/X]🧵

11

116

643

Seohong Park

@seohong_park

28 days

Introducing *dual representations*! tl;dr: We represent a state by the "set of similarities" to all other states. This dual perspective has lots of nice properties and practical benefits in RL. Blog post: https://t.co/lw1PortD9E Paper: https://t.co/zYKFjyOy7C ↓

14

97

792

Pan Lu

@lupantech

29 days

🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐 https://t.co/Smp4uMNGI3 📄 https://t.co/e4pb6lnGqe AgentFlow unlocks full potential of LLMs w/ tool-use. (And yes, our 3/7B model beats GPT-4o)👇

30

240

1K

Suhas Kotha

@kothasuhas

2 months

Since compute grows faster than the web, we think the future of pre-training lies in the algorithms that will best leverage ♾ compute We find simple recipes that improve the asymptote of compute scaling laws to be 5x data efficient, offering better perf w/ sufficient compute

9

83

444

Logan Engstrom

@logan_engstrom

8 months

Want state-of-the-art data curation, data poisoning & more? Just do gradient descent! w/ @andrew_ilyas Ben Chen @axel_s_feldmann @wsmoses @aleks_madry: we show how to optimize final model loss wrt any continuous variable. Key idea: Metagradients (grads through model training)

9

35

176

Yejin Choi

@YejinChoinka

2 months

Honored to be back on TIME100 AI for 2025 — alongside my longtime heroes @drfeifei and @BarzilayRegina! 😍 The recognition goes to my amazing students and colleagues, who strive to find ways to use AI to better humanity, as opposed to making AI for the sake of making AI better

40

39

490

Ross Taylor

@rosstaylor90

2 months

Most takes on RL environments are bad. 1. There are hardly any high-quality RL environments and evals available. Most agentic environments and evals are flawed when you look at the details. It’s a crisis: and no one is talking about it because they’re being hoodwinked by labs

30

46

704

Boris Power

@BorisMPower

3 months

At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel-prize winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️

159

653

4K

Bryan Catanzaro

@ctnzr

3 months

Today we're releasing NVIDIA Nemotron Nano v2 - a 9B hybrid SSM that is 6X faster than similarly sized models, while also being more accurate. Along with this model, we are also releasing most of the data we used to create it, including the pretraining corpus. Links to the

38

240

1K

Kyunghyun Cho

@kchonyc

3 months

recently gave a talk on <Reality Checks> at two venues, and discussed (and rambled) about how leaderboard chasing is awesome (and we want it to continue) but that this isn't easy because everyone (me! me! me!) wants to write more papers. the link to the slide deck in the reply.

2

14

123

Ximing Lu

@GXiming

3 months

🚀 How far can RL scaling take LLMs? Drop ProRLv2! 🔥We keep expanding LLM’s reasoning boundaries through 3,000+ RL steps over 5 domains and set a new state-of-the-art ✨ among 1.5B reasoning models. 🔗Full blog: https://t.co/Xj1oaLK5gE 🤗Open model:

huggingface.co

Andrew Zhao

@_AndrewZhao

5 months

RL scaling is here https://t.co/IX8z8XV6WX

3

28

220

Andrew Gordon Wilson

@andrewgwils

3 months

A common takeaway from "the bitter lesson" is we don't need to put effort into encoding inductive biases, we just need compute. Nothing could be further from the truth! Better inductive biases mean better scaling exponents, which means exponential improvements with computation.

20

35

421

Chanwoo Park

@chanwoopark20

4 months

(1/x) Excited to share our new work on MAPoRL🍁: Multi-Agent Post-Co-Training for Collaborative LLMs with RL. Most current approaches just prompt pre-trained models and hope they’ll work together. But can we train LLM to discover the collaboration strategy?

10

5

48

Google DeepMind

@GoogleDeepMind

4 months

Gemini solved the math problems end-to-end in natural language (English). This differs from our results last year when experts first translated them into formal languages like Lean for specialized systems to tackle.

2

19

369

Seungju Han

@SeungjuHan3

4 months

life update: I'll be starting my PhD in CS at Stanford this September! I'm very excited to continue my research on reasoning of language models and to make new friends in the Bay Area! I'm deeply grateful to everyone who supported me and made this milestone possible

35

19

742

Seungju Han

@SeungjuHan3

4 months

https://t.co/AIgw5vJBHV for reproduce

github.com

Contribute to wade3han/reasoning_or_memorization development by creating an account on GitHub.

0

4

Seungju Han

@SeungjuHan3

4 months

https://t.co/KcfT8vHTSf i thought this paper interesting and tried to reproduce the numbers in Table 2 very impressive that models can memorize the problems in benchmarks, especially MATH500 / AIME24 / AMC23. GPQA, AIME25, and LiveMathBench are less memorized

2

18

Wen Sun

@WenSun1

4 months

Does RL actually learn positively under random rewards when optimizing Qwen on MATH? Is Qwen really that magical such that even RLing on random rewards can make it reason better? Following prior work on spurious rewards on RL, we ablated algorithms. It turns out that if you

Gokul Swamy

@g_k_swamy

4 months

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find out this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: https://t.co/fPFfw17IIz

1

14

102

Seungju Han

@SeungjuHan3

4 months

how do people fairly evaluate agents with web access on benchmarks like HLE or GPQA? there could be content directly related to the benchmark on the web (e.g. blogpost showing an example from the benchmark), how is this issue addressed?

0

5

Simo Ryu

@cloneofsimo

4 months

n-simplex attention makes incredible sense because of its honesty: it literally says you can put more compute on attention operation to get more gains: we've seen this trend so many times. This differs from lot of 'suspicious' claim, such as you can use less compute to perform

14

19

523