Xiuyu Li Profile
Xiuyu Li

@xiuyu_l

Followers
2K
Following
2K
Media
47
Statuses
287

Efficiently scaling agents. CS PhD student @berkeley_ai. Prev @NVIDIA @AIatMeta @Cornell.

Bay Area
Joined August 2017
@xiuyu_l
Xiuyu Li
7 months
Scale smarter, not harder! Long CoT reasoning is powerful, but its sequential nature limits how efficiently and easily it can scale. We incentivize LMs to divide and conquer subtasks in parallel, selectively gathering only the highest-quality explorations.
@jiayi_pirate
Jiayi Pan
7 months
We explore a new dimension in scaling reasoning models in Adaptive Parallel Reasoning. APR lets LMs learn to orchestrate both serial & parallel compute E2E via supervised training + RL — w/ better efficiency and scalability than long CoT on Countdown 🧵  https://t.co/BKLhZ4fHEt
3
22
90
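The divide-and-conquer idea in the thread above can be sketched as a toy: fan subtasks out in parallel, then keep only the best explorations. This is a minimal illustration, not APR's trained policy — `explore`, its placeholder quality score, and the subtask strings are all invented for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def explore(subtask):
    # Stand-in for an LM exploring one subtask branch; returns
    # (answer, quality_score). The scoring heuristic is hypothetical.
    answer = f"solution-for-{subtask}"
    score = len(subtask)  # placeholder quality signal
    return answer, score

def parallel_reason(subtasks, top_k=1):
    # Fan subtasks out in parallel, then selectively gather only
    # the highest-quality explorations (the divide-and-conquer idea).
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(explore, subtasks))
    results.sort(key=lambda r: r[1], reverse=True)
    return results[:top_k]

best = parallel_reason(["factor 36", "sum the digits of 2025"], top_k=1)
```

A real system would score branches with a learned verifier rather than a length heuristic; the structure (parallel fan-out, selective gather) is the point here.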
@jyangballin
John Yang
2 days
New eval! Code duels for LMs ⚔️ Current evals test LMs on *tasks*: "fix this bug," "write a test." But we code to achieve *goals*: maximize revenue, cut costs, win users. Meet CodeClash: LMs compete via their codebases across multi-round tournaments to achieve high-level goals
24
86
340
@samir_khaki
Samir Khaki
15 days
Visual reasoning isn’t just seeing — it’s about efficient retrieval across thousands of tokens and multiple turns of conversation. Meet SparseVILA (ICCV 2025) ⚡ Highlights: 🧩 Framework: Decoupled sparsity — query-agnostic prefill, query-aware decoding. 🧠 Speed: 4.0x faster
1
2
4
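The decoupled sparsity framing above — query-agnostic prefill vs. query-aware decoding — can be sketched in a few lines. This is a toy with invented saliency scores (activation norm at prefill, dot-product similarity at decode), not SparseVILA's actual selection mechanism.

```python
import numpy as np

def prefill_prune(tokens, keep):
    # Query-agnostic prefill: keep the visual tokens with the largest
    # activation norms (a stand-in saliency score), before any query exists.
    norms = np.linalg.norm(tokens, axis=1)
    idx = np.argsort(norms)[-keep:]
    return tokens[np.sort(idx)]

def decode_select(tokens, query, keep):
    # Query-aware decoding: retrieve the cached tokens most similar
    # to the current query vector, per turn of conversation.
    sims = tokens @ query
    idx = np.argsort(sims)[-keep:]
    return tokens[np.sort(idx)]

rng = np.random.default_rng(0)
vis = rng.normal(size=(1000, 64))
kept = prefill_prune(vis, keep=256)                  # cache only 256 tokens
ctx = decode_select(kept, rng.normal(size=64), keep=32)
```

The design point: pruning once at prefill shrinks the cache for every later turn, while per-query retrieval keeps decoding relevant to each new question.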
@ZitongYang0
Zitong Yang
20 days
The passing of the physicist Chen-Ning Yang ( https://t.co/LOY46RpBhz) saddens me. He has been a long-time hero and role model for me. Below is a short essay I wrote yesterday about Yang that I shared with many of my friends. I translated it into English using Gemini: ``` The
10
65
417
@yukangchen_
Yukang Chen
24 days
We open-sourced QeRL — Quantization-enhanced Reinforcement Learning! 🧠 4-bit quantized RL training 💪 Train a 32B LLM on a single H100 GPU ⚙️ 1.7× faster overall training 🎯 Accuracy on par with bfloat16 training 🔥 Supports NVFP4 quantization format Moreover, we show
11
68
352
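For readers unfamiliar with the 4-bit part: the basic idea of 4-bit weight quantization is mapping floats to 16 integer levels plus a scale. A minimal symmetric per-tensor sketch (this illustrates generic int4 quantization, not QeRL's NVFP4 format or its RL integration):

```python
import numpy as np

def quantize_4bit(w):
    # Symmetric per-tensor 4-bit quantization: map weights to integers
    # in [-8, 7] and store one scale factor for dequantization.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from int4 codes.
    return q.astype(np.float32) * scale

w = np.array([0.7, -0.35, 0.07, 0.0])
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)  # low-error reconstruction of w
```

Storing weights at 4 bits instead of 16 is what lets a 32B model fit RL training on a single GPU; per-channel scales and NVFP4's block format refine this basic scheme.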
@xiuyu_l
Xiuyu Li
29 days
Thrilled to see our APR paper featured in the State of AI Report 2025! The future of agentic AI lies in parallel reasoning—scaling beyond single-threaded thought to handle truly long-horizon challenges.
@nathanbenaich
Nathan Benaich
29 days
🪩The one and only @stateofaireport 2025 is live! 🪩 It’s been a monumental 12 months for AI. Our 8th annual report is the most comprehensive it's ever been, covering what you *need* to know about research, industry, politics, safety and our new usage data. My highlight reel:
3
6
42
@Chenfeng_X
Chenfeng_X
1 month
🥳We’re releasing StreamDiffusionV2 for the live-stream community—from individual creators with one GPU to enterprise platforms with many. StreamDiffusionV2 is our follow-up to StreamDiffusion: #StreamDiffusion powered real products, but temporal consistency still bugged us.
12
45
223
@Agentica_
Agentica Project
1 month
Introducing Pepper🌶️! An open-source, real-time, event-driven architecture to power the next generation of proactive agents. Tired of static reactive chatbots? Pepper enables agents that anticipate your needs, actively engage, and work continuously in the background (think
11
61
483
@Xinyu2ML
Xinyu Yang
1 month
These days, LoRA seems less prominent in mainstream discussions compared to full FT. However, the post from @thinkymachines highlights that LoRA can actually match full FT in real-world customization scenarios! One year ago, one of my previous works ( https://t.co/yWjsYdG3xZ)
@thinkymachines
Thinking Machines
1 month
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.
5
25
169
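The LoRA setup being compared to full fine-tuning above is small enough to sketch directly: the pretrained weight stays frozen and only a low-rank delta is trained. A minimal numpy version (dimensions and init scale chosen for illustration):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    # LoRA: y = x W + (alpha / r) * x A B, where A (d x r) and
    # B (r x k) are the only trainable matrices and r << d, k.
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A) @ B

d, k, r = 64, 64, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d, k))        # frozen pretrained weight
A = rng.normal(size=(d, r)) * 0.01 # trainable down-projection
B = np.zeros((r, k))               # trainable up-projection, init 0
x = rng.normal(size=(1, d))
y0 = lora_forward(x, W, A, B)      # == x @ W at init, since B = 0
```

Initializing B to zero makes the adapted model exactly match the base model before training, which is part of why LoRA is a safe drop-in for customization.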
@yukangchen_
Yukang Chen
1 month
🚀 We open-sourced LongLive — interactive, real-time long-video generation. 👥Generates video in real time as users enter text prompts. ⚡️20.7 FPS on a single H100,⏱️up to 240s per clip. 🎬Fine-tunes SOTA short-video models (e.g., Wan) into long-video generators. 🌍One step
4
18
79
@Chenfeng_X
Chenfeng_X
1 month
Happy to share that two of our papers were accepted by @NeurIPSConf 2025 as #Spotlight papers! 1. 👼Angles Don’t Lie: Unlocking Training-Efficient RL from a Model’s Own Signals TL;DR: Token angles—the model’s self-generated signals—can reveal how well it grasps the data. By
18
33
308
@HaochengXiUCB
Haocheng Xi
1 month
🚀 Introducing Sparse VideoGen2 (SVG2) — Pareto-frontier video generation acceleration with semantic-aware sparse attention! 🏆Spotlight paper accepted by #NeurIPS2025 ✅ Training-free & plug-and-play ✅ Up to 2.5× faster on HunyuanVideo, 1.9× faster on Wan 2.1 ✅ SOTA quality
16
59
261
@Guangxuan_Xiao
Guangxuan Xiao
3 months
Just wrote a post on my understanding of the statistics behind block sparse attention. My take is that it works by using the "learned similarity gap," which creates a simple SNR formula connecting retrieval quality with model architecture. Read more:
guangxuanx.com
How can a language model comprehend a million-token document without drowning in O(N²) attention cost? A statistical model revealing the success of block sparse attention through learned similarity...
5
48
387
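The mechanism the post analyzes — scoring whole key blocks and attending only within the top ones — can be sketched as follows. This is a toy with a mean-dot-product block score, not the post's statistical model; the "learned similarity gap" is what makes the relevant block's score stand out from the rest.

```python
import numpy as np

def block_sparse_attention(q, K, block=64, keep=2):
    # Score each key block by its mean dot product with the query,
    # then restrict softmax attention to the top-`keep` blocks.
    n = K.shape[0]
    blocks = K.reshape(n // block, block, -1)
    block_score = (blocks @ q).mean(axis=1)   # one score per block
    top = np.argsort(block_score)[-keep:]
    mask = np.full(n, -np.inf)                # -inf: block not selected
    for b in top:
        mask[b * block:(b + 1) * block] = 0.0
    logits = K @ q + mask
    logits -= logits.max()                    # numerical stability
    p = np.exp(logits)
    return p / p.sum()                        # attention over kept blocks only

rng = np.random.default_rng(0)
K = rng.normal(size=(256, 16))
p = block_sparse_attention(rng.normal(size=16), K)  # mass on 2 of 4 blocks
```

Cost drops from attending over all N keys to scoring N/block blocks plus attending within the kept ones, which is the point of the O(N²) escape the card describes.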
@xiuyu_l
Xiuyu Li
3 months
By using a Mamba projector for spatio-temporal fusion and pooling in VLM training, we achieve 8× token compression for long video understanding with SoTA performance. Last year, I spent quite some time on context compression, and one key lesson was clear: when compression is
@wonmin_byeon
Wonmin Byeon
3 months
🚀 New paper: STORM — Efficient VLM for Long Video Understanding STORM cuts compute costs by up to 8× and reduces decoding latency by 2.4–2.9×, while achieving state-of-the-art performance. Details + paper link in the thread ↓
0
0
9
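The 8× token compression above can be illustrated with the simplest possible stand-in: temporal average pooling of visual tokens. STORM does the spatio-temporal fusion with a Mamba projector before pooling; plain mean pooling below is only a toy to show where the 8× comes from.

```python
import numpy as np

def pool_video_tokens(tokens, factor=8):
    # Average-pool along the temporal token axis to compress the
    # sequence fed to the LLM by `factor`.
    t, d = tokens.shape
    t = (t // factor) * factor          # drop any ragged tail
    return tokens[:t].reshape(t // factor, factor, d).mean(axis=1)

frames = np.random.default_rng(0).normal(size=(1024, 256))
compressed = pool_video_tokens(frames, factor=8)   # 1024 -> 128 tokens
```

Fewer tokens into the LLM is what drives both the compute saving and the decoding-latency reduction quoted in the thread.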
@xiuyu_l
Xiuyu Li
3 months
And GPT-5 is a good model
1
0
3
@xiuyu_l
Xiuyu Li
3 months
I’ve seen people worry that LLMs have hit a wall after GPT-5’s release. I think that’s the wrong mindset. You can’t believe in AGI only when OpenAI delivers a miracle. The journey is longer than the hype cycles.
1
0
14
@Guangxuan_Xiao
Guangxuan Xiao
3 months
I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models. For those interested in the details: https://t.co/0EAi2KQMMx
39
284
2K
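The attention-sink mechanism the deep-dive describes leads to a very small cache policy: always keep the first few tokens (the sinks) plus a sliding window of recent tokens. A minimal sketch of that StreamingLLM-style policy (sizes are illustrative):

```python
def sink_cache_indices(seq_len, n_sink=4, window=8):
    # Keep the first `n_sink` "attention sink" tokens plus a sliding
    # window of the most recent tokens; evict everything in between.
    if seq_len <= n_sink + window:
        return list(range(seq_len))
    return list(range(n_sink)) + list(range(seq_len - window, seq_len))

kept = sink_cache_indices(100, n_sink=4, window=8)
# cache holds tokens [0..3] + [92..99], independent of sequence length
```

Because softmax attention dumps excess probability mass onto those early tokens, evicting them degrades generation — keeping them pinned is what makes the fixed-size cache work.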
@xiuyu_l
Xiuyu Li
3 months
Unrelated but curious note when watching today’s GPT-5 livestream: the generated subtitles always lagged 3–5 seconds behind the audio. The year is 2025, and we still don't have universal real-time ASR deployed in the cloud. Worth pondering.
0
0
4
@hanrui_w
Ryan Hanrui Wang
3 months
Announcing Eigen AI @Eigen_AI_Labs, the world’s first company dedicated to AEI — Artificial Efficient Intelligence. 🚀 The future of AI is already here; it’s simply not evenly distributed. Our mission is to close that gap by driving radical efficiency so that every person and
@Eigen_AI_Labs
Eigen AI
3 months
🚀Founded by four dedicated MIT graduates, Eigen AI is the world's first company focusing on AEI – Artificial Efficient Intelligence, making AI accessible for all. Today OpenAI dropped GPT-OSS. We teamed up with our partners SGLang @lmsysorg and @NVIDIA to deliver open-source
1
13
59
@baifeng_shi
Baifeng
3 months
We just dropped a few new PS3 models, with SOTA performance compared to existing vision encoders such as SigLIP2, C-RADIOv2, AIMv2, InternViT2.5, and Perception Encoder! Coming along with several new VILA-HD models. Check it out👇 Models: https://t.co/UwjpBWpFBj Code:
4
16
85
@Chenfeng_X
Chenfeng_X
3 months
📢 Excited to share a slightly late update (before it’s no longer news): I’ll be joining @UTAustin @UTCompSci as an Assistant Professor! I'm recruiting PhD students at @UTCompSci in the Fall 2025 cycle and am also looking for RAs/interns! More info: https://t.co/JPDhVplhJX
31
31
406