Pan Lu

@lupantech

Followers
6K
Following
3K
Media
252
Statuses
1K

Postdoc @Stanford | PhD @CS_UCLA @uclanlp | Amazon/Bloomberg/Qualcomm Fellows | Ex @Tsinghua_Uni @Microsoft @allen_ai | ML/NLP: AI4Math, AI4Science, LLM, Agents

Palo Alto
Joined April 2016
@lupantech
Pan Lu
22 days
🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐 https://t.co/Smp4uMNGI3 📄 https://t.co/e4pb6lnGqe AgentFlow unlocks the full potential of LLMs with tool use. (And yes, our 3/7B model beats GPT-4o)👇
30
239
1K
@karpathy
Andrej Karpathy
4 days
Beautiful technical debugging detective longread that starts with a suspicious loss curve and ends all the way in the Objective-C++ depths of PyTorch's MPS backend, where addcmul_ silently fails on non-contiguous output tensors. I wonder how long before an LLM can do all of this.
@ElanaPearl
Elana Simon
7 days
New blog post: The bug that taught me more about PyTorch than years of using it started with a simple training loss plateau... ended up digging through optimizer states, memory layouts, kernel dispatch, and finally understanding how PyTorch works!
194
350
4K
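The contiguity issue at the heart of that bug is easy to reproduce in spirit. As a minimal stand-in (NumPy here, since the actual failure lives in PyTorch's MPS kernels), a transposed view shows what a non-contiguous tensor looks like and how a defensive copy sidesteps the problem; the variable names are illustrative:

```python
import numpy as np

# A transposed view shares memory with the original array but is no
# longer C-contiguous: its strides walk the buffer column-first.
a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = a.T

print(a.flags["C_CONTIGUOUS"])  # True
print(b.flags["C_CONTIGUOUS"])  # False

# np.ascontiguousarray copies the data into a fresh C-ordered buffer,
# the standard defensive fix before handing an array to a kernel that
# assumes a contiguous output layout.
c = np.ascontiguousarray(b)
print(c.flags["C_CONTIGUOUS"])  # True
```

The PyTorch analogue is `tensor.contiguous()`; the lesson of the blog post is that a kernel which assumes contiguity but does not check it can fail silently on views like `b`.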
@james_y_zou
James Zou
7 days
It’s here! #Agents4Science recording is now on YouTube! 🏆 3 Best Paper talks ⚡️ 11 Spotlights 🧠 Panel on the future of AI agent-driven science 📚 Lessons + surprises from this first-of-its-kind conf Full analysis of submissions + reviews coming soon! https://t.co/k4ksIWRaZy
3
31
132
@uclanlp
uclanlp
11 days
.@kaiwei_chang is getting a full house for his talk on “mathematical reasoning in visual context” at the Towards Comprehensive Reasoning in Vision-Language Models tutorial at #ICCV2025. Still time to come and engage in room 318A!
0
9
42
@alex_prompter
Alex Prompter
18 days
Holy shit. MIT just built an AI that can rewrite its own code to get smarter 🤯 It’s called SEAL (Self-Adapting Language Models). Instead of humans fine-tuning it, SEAL reads new info, rewrites it in its own words, and runs gradient updates on itself, literally performing
648
2K
12K
@lupantech
Pan Lu
17 days
Thank you very much for covering our work @_akhaliq! 🤗
@_akhaliq
AK
22 days
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
0
1
19
@HuggingPapers
DailyPapers
22 days
AgentFlow: In-the-Flow Optimization for LLM Agents A new trainable, modular agentic system that optimizes its planner live within the multi-turn loop. Achieves +14.9% on search, +14.0% on agentic reasoning, and +14.5% on math, outperforming models like GPT-4o with a 7B backbone.
2
7
23
@Marktechpost
Marktechpost AI Dev News ⚡
22 days
Stanford Researchers Released AgentFlow: In-the-Flow Reinforcement Learning (RL) for Modular, Tool-Using AI Agents AgentFlow is a trainable, modular agent framework—Planner, Executor, Verifier, Generator with explicit memory—that optimizes only the Planner in-loop using Flow-GRPO,
7
10
20
@HuggingPapers
DailyPapers
20 days
Stanford unveils AgentFlow: In-the-flow Agentic AI A new trainable modular system that learns live to plan & use tools, outperforming even GPT-4o on reasoning tasks with a 7B model. Huge gains: +14.9% search, +14.5% math.
2
6
20
@HuggingPapers
DailyPapers
20 days
Dive into AgentFlow's Flow-GRPO algorithm. Explore the code, try the demo, and see how to train your own modular agents on Hugging Face! Paper: https://t.co/2LT02uk0g6 Demo: https://t.co/F94dNJrBSH Model:
0
3
6
@lupantech
Pan Lu
22 days
This was a huge team effort. A massive shoutout to the brilliant minds behind the project: 🌟 @zhuofengli96475, @GhxIsaac, @SeungjuHan3, @ShengLiu_, @jianwen_xie, @yuz9yuz, @YejinChoinka, @james_y_zou ❤️And a huge thank you to our supporters @LambdaAPI, @RenPhil21, @StanfordHAI,
2
0
8
@lupantech
Pan Lu
22 days
Ready to see the magic for yourself? ✨ Dive into our interactive visualizations and watch AgentFlow's thought process, step-by-step. See how it plans, executes, and self-corrects in real-time: 📊 https://t.co/GOsN2U11Zt Or, put it to the test! Try our live demo on Hugging Face
0
0
10
@lupantech
Pan Lu
22 days
More thinking time = better answers? For AgentFlow, the answer is a clear YES. ✅ We gave our agent a bigger "turn budget" at inference time. The result? Performance climbed steadily across all benchmarks. 📈 It uses the extra steps wisely for deeper research, trying new
0
0
6
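The "turn budget" idea can be sketched in a few lines. This is our illustration, not the AgentFlow implementation: the budget is just the bound on the outer loop, and because the agent returns early once its check accepts an answer, a larger budget only spends extra steps on the problems that need them. The function names and signatures are assumptions:

```python
# Hedged sketch: inference with a configurable turn budget. The agent
# stops as soon as `verify` accepts an answer, so raising `max_turns`
# adds compute only for tasks that are not yet solved.
def run_agent(task, plan_step, verify, max_turns=10):
    memory = []
    answer = None
    for turn in range(1, max_turns + 1):
        action, answer = plan_step(task, memory)  # one planning step
        memory.append(action)                     # grow shared memory
        if verify(task, answer):
            return answer, turn                   # solved within budget
    return answer, max_turns                      # budget exhausted

# Toy run: this "agent" needs one turn per unit of task difficulty,
# so task 3 is solved exactly on turn 3.
answer, turns = run_agent(3, lambda t, m: (len(m), len(m) + 1),
                          lambda t, a: a == t)
```

With `max_turns=2` the same toy task would exhaust its budget, which mirrors the tweet's observation that more thinking time lifts performance on harder benchmarks.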
@lupantech
Pan Lu
22 days
Does this only work for one specific model size? Nope. We tested AgentFlow with both 3B and 7B backbones. The result: our Flow-GRPO training delivered consistent, significant performance boosts for both. 📈 This shows our "in-the-flow" optimization is a robust approach that
0
0
5
@lupantech
Pan Lu
22 days
Is the training efficient? You bet. ⚡️ As AgentFlow trains with Flow-GRPO, it gets: ✅ Smarter: Rewards (success rate) steadily increase. ✅ Faster: It learns to solve problems in fewer steps, making its solutions more concise. Compared to traditional tool-use RL, our agentic
0
1
7
@lupantech
Pan Lu
22 days
But does it really learn to plan better? Let's look at an example. Before training: The agent gets stuck in a loop. It tries a tool, fails, repeats the exact same mistake, and gives up. 🔁 After Flow-GRPO training: It hits the same error. But instead of giving up, it changes
0
0
7
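The before/after behavior can be caricatured in a few lines. This is purely our illustration of "don't repeat a failed action", not AgentFlow's planner: the untrained agent keeps retrying the first candidate, while a memory-aware choice skips actions that already failed. The helper name `pick_action` is hypothetical:

```python
# Hypothetical sketch: break the retry loop by consulting memory.
# `memory` holds (action, succeeded) pairs from earlier turns.
def pick_action(candidates, memory):
    failed = {action for action, ok in memory if not ok}
    for action in candidates:       # candidates in preference order
        if action not in failed:
            return action           # first action not yet seen to fail
    return candidates[0]            # everything failed: fall back
```

Without the `failed` check the function would return `"search"` forever, which is exactly the stuck loop the tweet describes before Flow-GRPO training.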
@lupantech
Pan Lu
22 days
So does this training actually work? Absolutely. The Planner becomes a tool-use expert. 🧠 It learns to pick the right tool for the right job: ➡️ For broad questions, it learns to use Google Search more. ➡️ For specialized medical questions, it smartly switches to Wikipedia &
0
0
5
@lupantech
Pan Lu
22 days
How do you train an agent for complex, multi-step tasks? 🤔 The reward (success!) only comes at the end. How does the Planner know which early decisions were the right ones? Our solution: a new RL algorithm called Flow-GRPO. 💡 The Core Idea: We broadcast the final outcome
0
0
10
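The broadcast idea can be sketched concretely. This is our illustrative reading of the tweet, not the paper's Flow-GRPO code: each rollout earns one terminal reward, the group of rollouts is normalized GRPO-style, and the resulting advantage is copied to every turn so early decisions share credit for the final outcome. The function name and signature are assumptions:

```python
import statistics

# Sketch of outcome broadcasting (not the exact Flow-GRPO update):
# one terminal reward per rollout (e.g. 1.0 success / 0.0 failure),
# group-normalized, then replicated across all turns of that rollout.
def broadcast_advantages(group_rewards, turns_per_rollout):
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards) or 1.0  # guard: identical rewards
    advantages = []
    for reward, n_turns in zip(group_rewards, turns_per_rollout):
        adv = (reward - mean) / std
        advantages.append([adv] * n_turns)  # same signal at every turn
    return advantages
```

For a group of one successful 3-turn rollout and one failed 2-turn rollout, every turn of the winner gets advantage +1.0 and every turn of the loser gets -1.0, so the planner's early moves are reinforced or penalized even though the reward arrived only at the end.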
@lupantech
Pan Lu
22 days
So how does the Planner make its smart decisions? It has a powerful set of tools at its disposal. 🧰 For any given task, the Planner can choose the best tools for the job: 🐍 Python Coder: To solve math, run logic, or analyze data. 🔍 Google Search: For the latest info from the
0
0
5
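A toolset like the one above is commonly exposed to a planner as a registry keyed by tool name. The tool names below come from the tweet; the dispatch interface and the stubbed tool bodies are assumptions for illustration:

```python
# Hypothetical tool registry: the planner emits a tool name, and the
# executor dispatches to the matching callable. Real tools would run
# code or hit a search API; these stubs just echo the routing.
TOOLS = {
    "python_coder": lambda query: f"exec: {query}",
    "google_search": lambda query: f"search: {query}",
}

def call_tool(name, query):
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](query)
```

Keeping tools behind a uniform `name -> callable` interface is what lets the Planner stay a pure decision-maker: it only ever chooses a string, and the Executor owns the side effects.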
@lupantech
Pan Lu
22 days
So, what's the secret behind these results? 🤫 AgentFlow isn't one giant model. It's a coordinated team of four specialized agents, each with a clear job: 🧭 Planner: The strategist. Decides the next step and which tool to use. 🛠️ Executor: The doer. Invokes the tool and gets
0
1
9
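The four-role loop described above can be sketched as follows. The role names (Planner, Executor, Verifier, Generator) and the explicit memory come from the tweet; the function signatures are our assumptions, not the paper's interfaces:

```python
# Hedged sketch of the four-agent loop with explicit shared memory.
def agent_flow(task, planner, executor, verifier, generator, max_turns=5):
    memory = []                               # explicit memory across turns
    for _ in range(max_turns):
        plan = planner(task, memory)          # strategist: next step + tool
        observation = executor(plan)          # doer: invoke the tool
        memory.append((plan, observation))
        if verifier(task, memory):            # judge: is the task solved?
            break
    return generator(task, memory)            # writer: compose final answer

# Toy instantiation with stub roles, just to show the data flow.
out = agent_flow(
    "q",
    planner=lambda t, m: f"step{len(m)}",
    executor=lambda p: p.upper(),
    verifier=lambda t, m: len(m) >= 2,
    generator=lambda t, m: len(m),
)
```

Note that only the `planner` is trained in AgentFlow's Flow-GRPO setup; the other three roles stay fixed, which is what makes in-the-loop optimization tractable.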