Zach Xu

@nehzux

Followers
101
Following
92
Media
2
Statuses
17

CS PhD @UChicago on LLMs. I evolve myself (slowly). @VirtueAI_co

San Francisco
Joined July 2015
@nehzux
Zach Xu
16 days
RT @togethercompute: 🛡️ VirtueGuard is LIVE on Together AI 🚀. AI security and safety model that screens input and output for harmful conten…
0
5
0
@nehzux
Zach Xu
20 days
RT @Chi_Wang_: 🚀 Meet MassGen! 🛠️ An open-source project for multi-agent scaling. Inspired by @grok Heavy & Gemini DeepThink. Enable parall…
0
44
0
@nehzux
Zach Xu
1 month
RT @james_y_zou: 📢 New conference where AI is the primary author and reviewer! Current venues don't allow AI-writte…
0
127
0
@nehzux
Zach Xu
2 months
0
0
3
@nehzux
Zach Xu
2 months
Bottom line: "Divide and Conquer" isn't a silver bullet, but with a principled strategy, it's a powerful pathway to handling massive contexts. Our framework tells you when and why. Dive into the details in our new paper! Link:
arxiv.org
We investigate the challenge of applying Large Language Models (LLMs) to long texts. We propose a theoretical framework that distinguishes the failure modes of long context tasks into three...
1
0
1
@nehzux
Zach Xu
2 months
We tested this on different tasks. The results show a clear "sweet spot" for chunking: it dominates when model confusion is high, but the task doesn't require seeing everything at once. For tasks with extreme cross-chunk dependency, single-shot is still better.
Tweet media one
1
0
1
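Read as a decision rule, the sweet spot is roughly the following (a sketch under assumed noise estimates, e.g. from a validation set; not the paper's procedure):

def should_chunk(model_noise_full: float,
                 model_noise_chunked: float,
                 task_noise: float,
                 aggregator_noise: float) -> bool:
    """Chunk only when the model-noise savings outweigh the task and
    aggregator noise that chunking introduces; single-shot pays only
    model noise on the full input. All inputs are assumed estimates."""
    chunked_total = model_noise_chunked + task_noise + aggregator_noise
    return chunked_total < model_noise_full

Extreme cross-chunk dependency shows up as a large task_noise, which flips the rule back to single-shot.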
@nehzux
Zach Xu
2 months
So how do we manage this? We built a simple system with a Planner, Workers, and a Manager. The "Planner" is an LLM that automatically designs the prompts for the other agents to minimize "aggregator noise" 🧩 and get the best results.
Tweet media one
1
0
0
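A hedged sketch of the Planner / Workers / Manager loop from the tweet above, assuming a hypothetical call_llm chat API; the prompt wording is invented for illustration and is not the paper's:

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM API call."""
    raise NotImplementedError

def plan_prompts(task: str) -> tuple[str, str]:
    # Planner: an LLM designs the prompts the other agents will use,
    # aiming to minimize aggregator noise downstream.
    worker_prompt = call_llm(
        "Write a prompt telling a worker LLM how to solve this task "
        f"on ONE chunk of a long document: {task}"
    )
    manager_prompt = call_llm(
        "Write a prompt telling a manager LLM how to combine per-chunk "
        f"answers with as little aggregation error as possible: {task}"
    )
    return worker_prompt, manager_prompt

def run(task: str, chunks: list[str]) -> str:
    worker_prompt, manager_prompt = plan_prompts(task)
    # Workers: process each chunk independently with the planned prompt.
    partials = [call_llm(f"{worker_prompt}\n\nChunk:\n{c}") for c in chunks]
    # Manager: aggregate the partial answers into the final result.
    return call_llm(manager_prompt + "\n\n" + "\n---\n".join(partials))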
@nehzux
Zach Xu
2 months
FINDING: A weaker LLM using our method can OUTPERFORM a stronger one (like GPT-4o) on certain long-context tasks. Why? Because "model noise" 🤯 for the big model can grow superlinearly on the full text, making it more confused than weaker models are on smaller, manageable chunks.
1
0
0
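A back-of-the-envelope version of that claim, in illustrative notation rather than the paper's: suppose per-call model noise scales as $c\,n^{\alpha}$ in input length $n$ with $\alpha > 1$. Splitting the input into $k$ chunks then strictly reduces total model noise:

\[
  \underbrace{k \cdot c\left(\frac{n}{k}\right)^{\alpha}}_{\text{model noise across } k \text{ chunks}}
  = \frac{c\,n^{\alpha}}{k^{\alpha-1}}
  \;<\; \underbrace{c\,n^{\alpha}}_{\text{single-shot model noise}}
  \qquad (\alpha > 1,\; k > 1),
\]

which is how a weaker model on small chunks can beat a stronger model reading the full text at once, provided task and aggregator noise stay small.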
@nehzux
Zach Xu
2 months
🤝 Task Noise: cross-chunk dependencies that can't be handled by processing each segment in isolation. 🤯 Model Noise: the model's performance degradation as the input length increases. 🧩 Aggregator Noise: incorrect combination of partial results from each chunk.
1
0
0
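One hedged way to formalize the trade-off above (the notation is illustrative, not necessarily the paper's): for an input of length $n$ split into $k$ chunks,

\[
  \epsilon_{\text{total}}(k) \;\lesssim\;
  \underbrace{\epsilon_{\text{task}}(k)}_{\text{cross-chunk deps}}
  \;+\; \underbrace{k\,\epsilon_{\text{model}}\!\left(\tfrac{n}{k}\right)}_{\text{per-chunk degradation}}
  \;+\; \underbrace{\epsilon_{\text{agg}}(k)}_{\text{recombination}},
\]

with single-shot as the $k = 1$ case: the task and aggregator terms vanish, but the model term is evaluated at the full length $n$.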
@nehzux
Zach Xu
2 months
To understand this, we introduce our Noise Decomposition Framework. It pinpoints why LLMs fail on long-context tasks by breaking the final error down into three distinct parts. This reveals the core trade-off between three noises:
1
0
0
@nehzux
Zach Xu
2 months
LLMs are getting more powerful, but they still struggle with super long documents. A common trick is "Divide and Conquer" - chop it up, process chunks, and combine. But when does this actually work? And when does it fail catastrophically? We investigated. 🧵
1
6
12
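For reference, the "chop it up, process chunks, and combine" pattern is a simple map-reduce over the document. A minimal sketch in Python, where call_llm is a hypothetical stand-in for any chat-completion API (illustrative only, not the paper's implementation):

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM API call."""
    raise NotImplementedError

def divide_and_conquer(question: str, document: str, chunk_size: int = 4000) -> str:
    # Divide: split the long document into fixed-size chunks.
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # Process: answer the question against each chunk in isolation.
    partials = [call_llm(f"{question}\n\nContext:\n{chunk}") for chunk in chunks]
    # Combine: merge the per-chunk answers into one final answer.
    merge_prompt = f"Combine these partial answers to the question: {question}\n\n"
    return call_llm(merge_prompt + "\n---\n".join(partials))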
@nehzux
Zach Xu
2 months
RT @NewInML: New to ML research? Never published at ICML? Don't miss this! Check out the New in ML workshop at ICML 2025 — no rejections,…
openreview.net
Welcome to the OpenReview homepage for ICML 2025 Workshop NewInML
0
14
0
@nehzux
Zach Xu
4 months
RT @GoogleDeepMind: Human generated data has fueled incredible AI progress, but what comes next? 📈 On the latest episode of our podcast, @…
0
265
0
@nehzux
Zach Xu
5 months
RT @RichardSSutton: I am pretty happy with this 30-minute summary of my views on the current state of AI and alignment.
0
101
0
@nehzux
Zach Xu
5 months
RT @karpathy: New 2h11m YouTube video: How I Use LLMs. This video continues my general audience series. The last one focused on how LLMs ar…
0
2K
0
@nehzux
Zach Xu
9 months
RT @NewInML: 📢 BIG ANNOUNCEMENT. NewInML is back at @NeurIPSConf on Dec 10th! Join us for insights from 3 incredible speakers: @tomgoldst…
0
5
0
@nehzux
Zach Xu
1 year
RT @NeurIPSConf: Soliciting participants for the NeurIPS 2024 Checklist Assistant Study! Pre-register before the abstract submission dead…
0
4
0