Pan Lu
@lupantech
Followers 6K · Following 3K · Media 252 · Statuses 1K
Postdoc @Stanford | PhD @CS_UCLA @uclanlp | Amazon/Bloomberg/Qualcomm Fellows | Ex @Tsinghua_Uni @Microsoft @allen_ai | ML/NLP: AI4Math, AI4Science, LLM, Agents
Palo Alto · Joined April 2016
            
🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐 https://t.co/Smp4uMNGI3 📄 https://t.co/e4pb6lnGqe AgentFlow unlocks the full potential of LLMs w/ tool use. (And yes, our 3B/7B models beat GPT-4o)👇
Replies 30 · Reposts 239 · Likes 1K
Beautiful technical debugging detective longread that starts with a suspicious loss curve and ends all the way down in the Objective-C++ depths of PyTorch's MPS backend, where addcmul_ silently fails on non-contiguous output tensors. I wonder how long before an LLM can do all of this.
Quoting: New blog post: The bug that taught me more about PyTorch than years of using it. It started with a simple training loss plateau... and ended with me digging through optimizer states, memory layouts, and kernel dispatch, and finally understanding how PyTorch works!
Replies 194 · Reposts 350 · Likes 4K
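For readers who want to poke at this, here is a minimal, hedged sketch of the setup the post describes (an in-place addcmul_ whose output tensor is a non-contiguous view); it does not reproduce the silent MPS miswrite itself, which per the post lives in the Metal kernel dispatch.

```python
import torch

# Context from the post: Adam's second-moment update calls addcmul_
# in place, roughly
#     exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
# The reported bug: on the MPS (Apple Metal) backend, addcmul_ could
# silently drop the write when the output tensor is non-contiguous.

out = torch.zeros(4, 4).t()     # transpose -> non-contiguous view
print(out.is_contiguous())      # False: the layout that triggered the bug

a = torch.randn(4, 4)
b = torch.randn(4, 4)
out.addcmul_(a, b, value=0.5)   # correct on CPU/CUDA; on affected MPS
                                # builds this reportedly left `out` unchanged

# Defensive workaround: make the buffer contiguous before in-place kernels.
out_fixed = out.contiguous()
out_fixed.addcmul_(a, b, value=0.5)
```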
"The Case for TERM LIMITS." Sen. Mitch McConnell can no longer walk, speak, or think. But he votes in the U.S. Senate.
Replies 5 · Reposts 8 · Likes 32
It’s here! #Agents4Science recording is now on YouTube! 🏆 3 Best Paper talks ⚡️ 11 Spotlights 🧠 Panel on the future of AI agent-driven science 📚 Lessons + surprises from this first-of-its-kind conf. Full analysis of submissions + reviews coming soon! https://t.co/k4ksIWRaZy
Replies 3 · Reposts 31 · Likes 132
.@kaiwei_chang is getting a full house for his talk on “mathematical reasoning in visual context” at the Towards Comprehensive Reasoning in Vision-Language Models tutorial at #ICCV2025. Still time to come and engage in room 318A!
Replies 0 · Reposts 9 · Likes 42
Holy shit. MIT just built an AI that can rewrite its own code to get smarter 🤯 It’s called SEAL (Self-Adapting Language Models). Instead of humans fine-tuning it, SEAL reads new info, rewrites it in its own words, and runs gradient updates on itself, literally performing…
Replies 648 · Reposts 2K · Likes 12K
Thank you very much for covering our work @_akhaliq! 🤗
Replies 0 · Reposts 1 · Likes 19
AgentFlow: In-the-Flow Optimization for LLM Agents. A new trainable, modular agentic system that optimizes its planner live within the multi-turn loop. Achieves +14.9% on search, +14.0% on agentic reasoning, and +14.5% on math, outperforming models like GPT-4o with a 7B backbone.
Replies 2 · Reposts 7 · Likes 23
Stanford Researchers Released AgentFlow: In-the-Flow Reinforcement Learning (RL) for Modular, Tool-Using AI Agents. AgentFlow is a trainable, modular agent framework—Planner, Executor, Verifier, Generator with explicit memory—that optimizes only the Planner in-loop using Flow-GRPO…
Replies 7 · Reposts 10 · Likes 20
Stanford unveils AgentFlow: In-the-flow Agentic AI. A new trainable modular system that learns live to plan & use tools, outperforming even GPT-4o on reasoning tasks with a 7B model. Huge gains: +14.9% search, +14.5% math.
Replies 2 · Reposts 6 · Likes 20
Dive into AgentFlow's Flow-GRPO algorithm. Explore the code, try the demo, and see how to train your own modular agents on Hugging Face! Paper: https://t.co/2LT02uk0g6 Demo: https://t.co/F94dNJrBSH Model:…
Replies 0 · Reposts 3 · Likes 6
This was a huge team effort. A massive shoutout to the brilliant minds behind the project: 🌟 @zhuofengli96475, @GhxIsaac, @SeungjuHan3, @ShengLiu_, @jianwen_xie, @yuz9yuz, @YejinChoinka, @james_y_zou ❤️ And a huge thank you to our supporters @LambdaAPI, @RenPhil21, @StanfordHAI…
Replies 2 · Reposts 0 · Likes 8
Ready to see the magic for yourself? ✨ Dive into our interactive visualizations and watch AgentFlow's thought process, step by step. See how it plans, executes, and self-corrects in real time: 📊 https://t.co/GOsN2U11Zt Or, put it to the test! Try our live demo on Hugging Face…
Replies 0 · Reposts 0 · Likes 10
More thinking time = better answers? For AgentFlow, the answer is a clear YES. ✅ We gave our agent a bigger "turn budget" at inference time. The result? Performance climbed steadily across all benchmarks. 📈 It uses the extra steps wisely for deeper research, trying new…
Replies 0 · Reposts 0 · Likes 6
Does this only work for one specific model size? Nope. We tested AgentFlow with both 3B and 7B backbones. The result: our Flow-GRPO training delivered consistent, significant performance boosts for both. 📈 This shows our "in-the-flow" optimization is a robust approach that…
Replies 0 · Reposts 0 · Likes 5
Is the training efficient? You bet. ⚡️ As AgentFlow trains with Flow-GRPO, it gets: ✅ Smarter: Rewards (success rate) steadily increase. ✅ Faster: It learns to solve problems in fewer steps, making its solutions more concise. Compared to traditional tool-use RL, our agentic…
Replies 0 · Reposts 1 · Likes 7
But does it really learn to plan better? Let's look at an example. Before training: The agent gets stuck in a loop. It tries a tool, fails, repeats the exact same mistake, and gives up. 🔁 After Flow-GRPO training: It hits the same error. But instead of giving up, it changes…
Replies 0 · Reposts 0 · Likes 7
So does this training actually work? Absolutely. The Planner becomes a tool-use expert. 🧠 It learns to pick the right tool for the right job: ➡️ For broad questions, it learns to use Google Search more. ➡️ For specialized medical questions, it smartly switches to Wikipedia &…
Replies 0 · Reposts 0 · Likes 5
             How do you train an agent for complex, multi-step tasks? 🤔 The reward (success!) only comes at the end. How does the Planner know which early decisions were the right ones? Our solution: a new RL algorithm called Flow-GRPO. 💡 The Core Idea: We broadcast the final outcome 
          
                
                0
              
              
                
                0
              
              
                
                10
              
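The post is truncated at its core idea, but the stated mechanism is to broadcast the final, trajectory-level outcome to every turn. A minimal sketch of that broadcasting step, with GRPO-style group normalization, is below; the function name and inputs are illustrative assumptions, not AgentFlow's actual API.

```python
import torch

def flow_grpo_advantages(group_rewards, turns_per_rollout, eps=1e-6):
    """Hedged sketch: turn one outcome reward per rollout into
    per-turn advantages by group-normalizing, then broadcasting."""
    r = torch.tensor(group_rewards, dtype=torch.float32)
    adv = (r - r.mean()) / (r.std() + eps)   # normalize within the group
    # Broadcast each rollout's single advantage to all of its turns,
    # so every planner decision shares the trajectory's final outcome.
    return [a.expand(n).clone() for a, n in zip(adv, turns_per_rollout)]

# Example: 4 rollouts of one task; two succeed (reward 1), two fail (0),
# with a different number of planner turns in each rollout.
advs = flow_grpo_advantages([1.0, 0.0, 1.0, 0.0], [3, 5, 2, 4])
print([t.tolist() for t in advs])
```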
So how does the Planner make its smart decisions? It has a powerful set of tools at its disposal. 🧰 For any given task, the Planner can choose the best tools for the job: 🐍 Python Coder: To solve math, run logic, or analyze data. 🔍 Google Search: For the latest info from the…
Replies 0 · Reposts 0 · Likes 5
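The tweet cuts off mid-list, but the pattern it describes (a planner selecting among named tools, which an executor then invokes) is commonly implemented as a simple registry. A hypothetical sketch, with stubbed tools named after the ones in the thread:

```python
# Every function below is a stub for illustration; the names follow the
# thread, but the real tools' signatures are not specified here.
def python_coder(code: str) -> str:
    return f"[ran code: {code!r}]"            # would execute Python

def google_search(query: str) -> str:
    return f"[web results for {query!r}]"     # would call a search API

def wikipedia_search(query: str) -> str:
    return f"[wiki results for {query!r}]"    # specialized lookups

TOOLS = {
    "python_coder": python_coder,
    "google_search": google_search,
    "wikipedia_search": wikipedia_search,
}

def execute(tool_name: str, **kwargs) -> str:
    """Dispatch a planner-chosen tool by name."""
    return TOOLS[tool_name](**kwargs)

print(execute("google_search", query="AgentFlow Flow-GRPO"))
```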
So, what's the secret behind these results? 🤫 AgentFlow isn't one giant model. It's a coordinated team of four specialized agents, each with a clear job: 🧭 Planner: The strategist. Decides the next step and which tool to use. 🛠️ Executor: The doer. Invokes the tool and gets…
Replies 0 · Reposts 1 · Likes 9
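As an illustration of that division of labor, here is a hedged sketch of how a four-module loop with explicit memory could be wired together; the class and method names are assumptions for illustration, not AgentFlow's real interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    steps: list = field(default_factory=list)   # explicit, shared memory

def run_agentflow(task, planner, executor, verifier, generator, max_turns=10):
    """Sketch of one task episode through the four modules."""
    mem = Memory()
    for _ in range(max_turns):
        action = planner.plan(task, mem)        # strategist: next step + tool
        result = executor.run(action)           # doer: invoke the chosen tool
        mem.steps.append((action, result))      # record the turn in memory
        if verifier.is_solved(task, mem):       # checker: decide if done
            break
    return generator.answer(task, mem)          # writer: compose the answer
```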
A sign you're becoming successful: People start calling you lucky. They don't see the 4am wake-ups. The missed parties. The failed attempts. The years of nothing working. They see the result and call it luck because that's easier than admitting they didn't do the work. Let them…
Replies 0 · Reposts 5 · Likes 9