Zefan Cai @Zefan_Cai X Profile

Zefan Cai

@Zefan_Cai

Followers

211

Following

159

Media

18

Statuses

117

Now Ph.D student @UWMadison Previous @PKU1898

Joined May 2023


Zefan Cai

@Zefan_Cai

1 month

RT @karpathy: The hottest new programming language is English.

0

6K

0

Zefan Cai

@Zefan_Cai

1 month

RT @Xinyu2ML: 🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware effici…

0

20

0

Zefan Cai

@Zefan_Cai

2 months

I feel that long context is the final important question. Once we can put all the training data used in SFT into context and still keep the inference cost constant, we can solve most of the current problems.

0

2

Zefan Cai

@Zefan_Cai

2 months

If you find our work useful, please cite us:
@misc{huang2025vista,
  title={VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection},
  author={Zeyi Huang and Yuyang Ji and Anirudh Sundara Rajan and Zefan Cai and Wen Xiao and Junjie Hu and …

0

Zefan Cai

@Zefan_Cai

2 months

If VisTA sparks ideas for adaptive tool-using agents, drop a ⭐ on the repo and cite us: 2505.20289. Let’s build vision systems that choose wisely 🧠⚡️ #AI #Vision #ReinforcementLearning #LLM #OpenSource.

0

1

Zefan Cai

@Zefan_Cai

2 months

Tech bits 🛠️
• Discrete action space = tool IDs
• Reward = downstream task score
• GRPO stabilizes updates across tool groups
• Training: 4×A100, 1-day synth RL rollouts

0
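The bullets above (tool IDs as discrete actions, task score as reward, group-based updates) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a plain categorical policy over tool IDs and uses only the group-normalized advantage from GRPO, omitting the clipped importance ratio and KL penalty of the full algorithm. The names `group_relative_advantages`, `grpo_style_step`, and `reward_fn` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def grpo_style_step(logits, reward_fn, group_size=8, lr=0.5, rng=random):
    """One simplified policy-gradient step on a categorical policy over tool IDs.

    Assumption: reward_fn(tool_id) stands in for the downstream task score.
    """
    probs = softmax(logits)
    # Sample a group of tool choices from the current policy.
    actions = rng.choices(range(len(logits)), weights=probs, k=group_size)
    rewards = [reward_fn(a) for a in actions]
    advantages = group_relative_advantages(rewards)

    grads = [0.0] * len(logits)
    for a, adv in zip(actions, advantages):
        for k in range(len(logits)):
            # d log pi(a) / d logit_k = 1[k == a] - pi(k)
            indicator = 1.0 if k == a else 0.0
            grads[k] += adv * (indicator - probs[k]) / group_size
    return [l + lr * g for l, g in zip(logits, grads)]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.0, 0.0, 0.0]  # three hypothetical tools
    for _ in range(200):
        # Toy reward: only tool 2 helps the downstream task.
        logits = grpo_style_step(logits, lambda a: 1.0 if a == 2 else 0.0, rng=rng)
    print(softmax(logits))
```

Because advantages are centered within each group, a group in which every sampled tool earns the same reward produces no update, which is one reason group normalization tends to keep updates stable.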

Zefan Cai

@Zefan_Cai

2 months

Plug & Play ⚙️
Just wrap your favorite vision-LLM: GPT-4o, Qwen-VL, whatever. VisTA drives the tool calls; the LLM focuses on reasoning.

0

Zefan Cai

@Zefan_Cai

2 months

Generalization FTW 🚀
On unseen tasks, VisTA still outperforms prompt-engineering & finetune baselines. Learning to explore > memorizing fixed recipes.

0

Zefan Cai

@Zefan_Cai

2 months

🤖 During inference, VisTA dynamically picks specialized modules: OCR for charts, geometric solvers for diagrams, etc. The policy remains model-agnostic—swap the underlying LLM and keep the same tool selector.

0

Zefan Cai

@Zefan_Cai

2 months

🏅 SOTA performance
• #1 ChartQA (chart reasoning)
• #1 Geometry3K (plane-geometry QA)
• #1 MathVerse (math diagram understanding)
All with the SAME agent policy!

0

Zefan Cai

@Zefan_Cai

2 months

Given a user query, the agent selects tools from a pre-defined set of external tools. The tools are applied to the image, and their outputs and the query are fed to a frozen reasoner model.

0
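The data flow described above (select tools, apply them to the image, pass tool outputs plus the query to a frozen reasoner) might be wired up roughly as follows. This is a minimal sketch under stated assumptions, not the VisTA codebase: `ToolAgentPipeline`, the `Tool` signature, the prompt layout, and the stub tools are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Assumption: a tool takes raw image bytes and returns a text description.
Tool = Callable[[bytes], str]


@dataclass
class ToolAgentPipeline:
    tools: Dict[str, Tool]                                    # pre-defined tool set
    select_tools: Callable[[str, Sequence[str]], List[str]]   # learned policy (stubbed here)
    reasoner: Callable[[str], str]                            # frozen reasoner model

    def answer(self, query: str, image: bytes) -> str:
        # 1. The agent picks tool names for this query.
        chosen = self.select_tools(query, list(self.tools))
        # 2. Each chosen tool is applied to the image.
        outputs = [f"[{name}] {self.tools[name](image)}" for name in chosen]
        # 3. Tool outputs and the query go to the frozen reasoner.
        prompt = "\n".join(outputs + [f"Question: {query}"])
        return self.reasoner(prompt)


if __name__ == "__main__":
    pipeline = ToolAgentPipeline(
        tools={
            "ocr": lambda img: "x-axis: year, y-axis: revenue",
            "geometry": lambda img: "no shapes detected",
        },
        # Stub policy: always choose OCR. A trained selector would go here.
        select_tools=lambda query, names: [n for n in names if n == "ocr"],
        # Stub reasoner: echoes the prompt it received.
        reasoner=lambda prompt: f"(reasoner saw)\n{prompt}",
    )
    print(pipeline.answer("What is on the x-axis?", b""))
```

Keeping the selector and the reasoner as separate callables reflects the separation described in the thread, where the reasoner stays frozen and only the tool selector is trained.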

Zefan Cai

@Zefan_Cai

2 months

Core idea: treat tool choice as an action. VisTA explores a library of visual modules, gets task feedback, and learns a policy with Group Relative Policy Optimization (GRPO)—zero extra supervision needed.

0

Zefan Cai

@Zefan_Cai

2 months

Meet VisTA (VisualToolAgent): an RL framework that teaches vision models to PICK the right tool for every visual-reasoning task, with no handcrafted prompts and no human demos. Paper 👉

arxiv.org

We introduce VisTA, a new reinforcement learning framework that empowers visual agents to dynamically explore, select, and combine tools from a diverse library based on empirical performance....

0

1

Zefan Cai

@Zefan_Cai

2 months

Project Page: ARXIV: Code:

github.com

VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection - OoDBag/VisTA

0

1

Zefan Cai

@Zefan_Cai

2 months

RT @huang43602: 🚨Our new paper: VisualToolAgent (VisTA) 🚨
Visual agents learn to use tools, no prompts or supervision!
✅RL via GRPO
✅Dec…

0

5

0

Zefan Cai

@Zefan_Cai

2 months

Really appreciate the support for VisualToolAgent (VisTA): our new RL-based framework for dynamic tool selection in visual reasoning.

AK

@_akhaliq

2 months

VisualToolAgent (VisTA). A Reinforcement Learning Framework for Visual Tool Selection

18

9

59

Zefan Cai

@Zefan_Cai

2 months

Super excited to work with my amazing collaborators @wendyxiao06091 @preminstrel @ChengLuo_lc @liyucheng_2 @zhendongucb @AnimaAnandkumar @JunjieHu12.

0

1

Zefan Cai

@Zefan_Cai

2 months

If R-KV helps your research, please cite 🎓
BibTeX & full details in the repo. Let’s make long-form reasoning efficient together! #AI #LLM #KVCache #MLOps #OpenSource #MLSys #SparseAttention

0

Zefan Cai

@Zefan_Cai

2 months

Everything’s open-source: code, scripts, Jupyter analysis notebooks. 🚀 Fork it, test it, file issues—we’d love feedback. GitHub:

github.com

R-KV: Redundancy-aware KV Cache Compression for Reasoning Models - Zefan-Cai/R-KV

0