kabi
@kakakbibibi
Followers
277
Following
853
Media
38
Statuses
363
Ph.D. Student in GSAI, @RenminUniv | Prev. Intern in @Alibaba_Qwen | Recent Works: Qwen2.5, AUTOIF, ARPO, WebThinker, Search-o1
Beijing
Joined August 2023
🥳 Introducing AEPO, an entropy-balanced agentic RL algorithm! 🙌 AEPO achieves diverse rollout sampling and prioritized learning of high-entropy tokens by balancing entropy. 📈 Impressive results: GAIA (65%), HLE (26%), and Webwalker (70%) on Pass@5! https://t.co/bOjHOAGngH
huggingface.co
2
5
9
An interesting exploration of deeply combining visual tools with reasoning!🥳
0
0
1
Tencent WeChat and BUPT introduce V-Thinker, a general-purpose multimodal reasoning assistant that enables "Interactive Thinking with Images" through end-to-end reinforcement learning. It actively edits, annotates, and transforms images to solve complex problems.
1
5
24
11 New Policy Optimization techniques ▪️ BAPO (BAlanced) ▪️ Training-Free GRPO ▪️ ASPO (Asymmetric Importance Sampling) ▪️ ICPO (In-Context) ▪️ GEPO (Graph-Enhanced) ▪️ IGPO (Information Gain-based) ▪️ AEPO (Agentic Entropy-Balanced) ▪️ AT-GRPO (Agent- and Turn-wise) ▪️ DGPO
9
87
433
DeepAgent: A General Reasoning Agent with Scalable Toolsets. @XiaoxiLi0111 et al. introduce a deep reasoning agent that autonomously thinks, discovers tools, and executes actions within a unified reasoning process. 📝 https://t.co/4S4AMaUBDh 👨🏽‍💻 https://t.co/9OVAW80AqO
github.com
🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets - RUC-NLPIR/DeepAgent
1
8
25
Glad to share DeepAgent! 🔍Auto Tool Search: Dynamically finds & uses right tools on the fly. 🧠Memory Folding: Brain-inspired memory condenses progress for efficient reasoning restarts. 🛠️ToolPO: RL training w/ fine-grained credit for tool use via a safe LLM simulator.
Excited to announce our new work: 🛠️DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper: https://t.co/BYL6aiuoWg Github:
0
14
69
DeepAgent introduces Autonomous Memory Folding & ToolPO for efficient RL training. It excels on 8 benchmarks, including tasks with 16,000+ APIs! Experience DeepAgent in action: 🎥 https://t.co/SCslvSMBPJ Paper:
huggingface.co
0
3
8
DeepAgent: A General Reasoning Agent with Scalable Toolsets https://t.co/PdTI0EJnLF
huggingface.co
0
1
3
Demo: 🚀 DeepAgent showcases three powerful capabilities: • 16,000+ RapidAPIs for general tasks • Navigation & embodied AI in interactive environments • Deep research with web search, page browsing, code execution, ...
0
3
6
Thank you for sharing our DeepAgent🤩
0
0
2
1. DeepAgent: A General Reasoning Agent with Scalable Toolsets 🔑 Keywords: DeepAgent, tool discovery, action execution, autonomous memory folding, reinforcement learning 💡 Category: Knowledge Representation and Reasoning 🌟 Research Objective: - The paper introduces
1
1
2
DeepAgent uses an autonomous memory folding mechanism: 1. Compresses past interactions into structured memories. 2. Reduces error accumulation. 3. Preserves critical info for long-horizon interactions.
1
1
0
DeepAgent: A General Reasoning Agent with Scalable Toolsets Introduces ToolPO, an end-to-end RL strategy for teaching general-purpose tool use in LLMs. https://t.co/FQUZSZZ8ay
arxiv.org
Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically...
0
1
1
DeepAgent: The AI That Thinks, Learns, and Remembers — A Paradigm Shift for Autonomous Agents...... https://t.co/YIKYSA8D2t
0
1
1
DeepAgent is a new AI that figures out how to use tools all by itself to solve complex tasks. It even has a smart memory system to handle super long projects without getting confused. Details below 👇
1
1
7
DeepAgent: A General Reasoning Agent with Scalable Toolsets
1
1
1
🚨BREAKING: Researchers just built an AI that teaches itself how to use new tools. It’s called DeepAgent, and it might be the first real step toward autonomous reasoning, and even the first real AGI prototype. Here’s why it’s a massive deal👇
12
12
41
The Results Are Ridiculous. Across 8 benchmarks, including ToolBench, TMDB, Spotify, WebShop, GAIA, and HLE, DeepAgent-32B beats ReAct-GPT4o and DeepSeek-R1 by huge margins. In open-set tasks (10K+ tools), it’s the first model to stay stable. This isn’t a demo; it’s the first scalable general
1
2
4