kabi

@kakakbibibi

Followers: 277
Following: 853
Media: 38
Statuses: 363

Ph.D. Student in GSAI, @RenminUniv | Prev. Intern in @Alibaba_Qwen | Recent Works: Qwen2.5, AUTOIF, ARPO, WebThinker, Search-o1

Beijing
Joined August 2023
@kakakbibibi
kabi
28 days
🥳 Introducing AEPO, an entropy-balanced agentic RL algorithm! 🙌 AEPO achieves diverse rollout sampling and prioritized learning of high-entropy tokens by balancing entropy. 📈 Impressive results: GAIA (65%), HLE (26%), and Webwalker (70%) on Pass@5! https://t.co/bOjHOAGngH
huggingface.co
2
5
9
@_akhaliq
AK
7 days
V-Thinker Interactive Thinking with Images
2
11
77
@kakakbibibi
kabi
3 days
An interesting exploration of deeply combining visual tools with reasoning!🥳
@_akhaliq
AK
7 days
V-Thinker Interactive Thinking with Images
0
0
1
@HuggingPapers
DailyPapers
7 days
Tencent WeChat and BUPT introduce V-Thinker A general-purpose multimodal reasoning assistant that enables "Interactive Thinking with Images" through end-to-end reinforcement learning. It actively edits, annotates, and transforms images to solve complex problems.
1
5
24
@TheTuringPost
TuringPost
12 days
11 New Policy Optimization techniques ▪️ BAPO (BAlanced) ▪️ Training-Free GRPO ▪️ ASPO (Asymmetric Importance Sampling) ▪️ ICPO (In-Context) ▪️ GEPO (Graph-Enhanced) ▪️ IGPO (Information Gain-based) ▪️ AEPO (Agentic Entropy-Balanced) ▪️ AT-GRPO (Agent- and Turn-wise) ▪️ DGPO
9
87
433
@_reachsumit
Sumit
19 days
DeepAgent: A General Reasoning Agent with Scalable Toolsets @XiaoxiLi0111 et al. introduces a deep reasoning agent that autonomously thinks, discovers tools, and executes actions within a unified reasoning process. 📝 https://t.co/4S4AMaUBDh 👨🏽‍💻 https://t.co/9OVAW80AqO
github.com
🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets - RUC-NLPIR/DeepAgent
1
8
25
@WenxiangJiao
Jiao Wenxiang
18 days
Glad to share DeepAgent! 🔍 Auto Tool Search: Dynamically finds & uses right tools on the fly. 🧠 Memory Folding: Brain-inspired memory condenses progress for efficient reasoning restarts. 🛠️ ToolPO: RL training w/ fine-grained credit for tool use via a safe LLM simulator.
@XiaoxiLi0111
Xiaoxi Li
18 days
Excited to announce our new work: 🛠️DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper: https://t.co/BYL6aiuoWg Github:
0
14
69
@HuggingPapers
DailyPapers
18 days
DeepAgent introduces Autonomous Memory Folding & ToolPO for efficient RL training. It excels on 8 benchmarks, including tasks with 16,000+ APIs! Experience DeepAgent in action: 🎥 https://t.co/SCslvSMBPJ Paper:
huggingface.co
0
3
8
@javaeeeee1
Dmitry Noranovich
18 days
DeepAgent: A General Reasoning Agent with Scalable Toolsets https://t.co/PdTI0EJnLF
huggingface.co
0
1
3
@XiaoxiLi0111
Xiaoxi Li
18 days
Demo: 🚀 DeepAgent showcases three powerful capabilities: • 16,000+ RapidAPIs for general tasks • Navigation & embodied AI in interactive environments • Deep research with web search, page browsing, code execution, ...
0
3
6
@kakakbibibi
kabi
16 days
Interesting work!🥰
@XiaoxiLi0111
Xiaoxi Li
18 days
Demo: 🚀 DeepAgent showcases three powerful capabilities: • 16,000+ RapidAPIs for general tasks • Navigation & embodied AI in interactive environments • Deep research with web search, page browsing, code execution, ...
0
0
1
@kakakbibibi
kabi
16 days
Thank you for sharing our DeepAgent🤩
@_akhaliq
AK
18 days
DeepAgent A General Reasoning Agent with Scalable Toolsets
0
0
2
@AINativeF
AI Native Foundation
18 days
1. DeepAgent: A General Reasoning Agent with Scalable Toolsets 🔑 Keywords: DeepAgent, tool discovery, action execution, autonomous memory folding, reinforcement learning 💡 Category: Knowledge Representation and Reasoning 🌟 Research Objective: - The paper introduces
1
1
2
@newlinedotco
💥 \newline
17 days
DeepAgent uses an autonomous memory folding mechanism: 1. Compresses past interactions into structured memories. 2. Reduces error accumulation. 3. Preserves critical info for long-horizon interactions.
1
1
0
@DaiJoshua48988
Auto arxiv
17 days
DeepAgent: A General Reasoning Agent with Scalable Toolsets Introduces ToolPO, an end-to-end RL strategy for teaching general-purpose tool use in LLMs. https://t.co/FQUZSZZ8ay
arxiv.org
Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically...
0
1
1
@jenray1986
Jenray
17 days
DeepAgent: The AI That Thinks, Learns, and Remembers — A Paradigm Shift for Autonomous Agents...... https://t.co/YIKYSA8D2t
0
1
1
@aisearchio
⚡AI Search⚡
17 days
DeepAgent is a new AI that figures out how to use tools all by itself to solve complex tasks. It even has a smart memory system to handle super long projects without getting confused. Details below 👇
1
1
7
@Dhiraj7kr
Dhiraj ●─◯ The AI Engineer
17 days
DeepAgent A General Reasoning Agent with Scalable Toolsets
1
1
1
@hasantoxr
Hasan Toor
17 days
🚨BREAKING: Researchers just built an AI that teaches itself how to use new tools. It’s called DeepAgent, and it might be the first real step toward autonomous reasoning and might be the first real AGI prototype. Here’s why it’s a massive deal👇
12
12
41
@kakakbibibi
kabi
16 days
Thank you for sharing our work!🥰
@hasantoxr
Hasan Toor
17 days
🚨BREAKING: Researchers just built an AI that teaches itself how to use new tools. It’s called DeepAgent, and it might be the first real step toward autonomous reasoning and might be the first real AGI prototype. Here’s why it’s a massive deal👇
0
0
1
@hasantoxr
Hasan Toor
17 days
The Results Are Ridiculous. Across 8 benchmarks, including ToolBench, TMDB, Spotify, WebShop, GAIA, and HLE, DeepAgent-32B beats ReAct-GPT4o and DeepSeek-R1 by huge margins. In open-set tasks (10K+ tools), it’s the first model to stay stable. This isn’t a demo; it’s the first scalable general
1
2
4