Zhiting Hu
@ZhitingHu
Followers 5K · Following 548 · Media 66 · Statuses 606
Assist. Prof. at UC San Diego; Artificial Intelligence, Machine Learning, Natural Language Processing
San Diego, CA
Joined April 2018
🔥Really excited to see the release of the PAN world model, a project I have been working on over the past few years. PAN is a general world model capable of simulating physical, agentic, and nested worlds, synthesizing infinite interactive experiences for training AI agents. Building on
NeurIPS is coming to San Diego — welcome, and looking forward to connecting! ☀️🌴 My lab at UC San Diego is hiring PhD students and postdocs. We focus on building AI agents🤖 that can interact, reason, and generalize in complex environments, including: 1⃣ Controllable,
Really impressive world model results. Progress in this space has been so fast.
🚀 Introducing PAN, our latest general world model. 💡 Compared to traditional video generation models like Sora 2, PAN simulates worlds you can interact with, over long horizons, with natural-language actions.
🌍 Introducing the long-horizon, interactive PAN World Model that lets you step into coherent worlds and evolve them through language-guided actions. A world model is far more than visuals — it understands world dynamics, enabling an agent to imagine scenarios, anticipate outcomes,
🚀 Excited to announce the release of PAN, a general world model I’ve been working on for years. PAN can simulate physical, agentic, and nested worlds — generating infinite interactive experiences to train and evaluate AI agents. Check out demo: https://t.co/rJeYowJZO8 👇
PAN significantly outperforms JEPA-2, Cosmos-2, and other prior world models across simulative reasoning, action simulation, and long-horizon prediction. The key: PAN integrates video, language, and actions in a unified latent space, in a way that leverages pretrained massive
In this paper we present the first full implementation of the Generative Latent Prediction (GLP) architecture for world modeling, which brings perception, state, action, and causality into a single, coherent world model that can plan, imagine, and reason through language,
arxiv.org
A world model enables an intelligent agent to imagine, predict, and reason about how the world evolves in response to its actions, and accordingly to plan and strategize. While recent video...
Check out the project page for more demos: https://t.co/ReqhbuqFnD and the paper for more technical details:
PAN world model architecture: autoregressive, generative latent prediction. ... 3/
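The architecture line above ("autoregressive, generative latent prediction") can be made concrete with a toy rollout loop. This is only a hedged sketch of the control flow, not PAN's implementation: `encode`, `predict`, and `decode` are hypothetical stand-ins for learned networks, and the state is a plain list of numbers.

```python
# Toy sketch of a generative-latent-prediction rollout:
# observe -> encode to latent -> predict next latent given an action -> decode,
# repeated autoregressively so each step consumes the previous latent.

def encode(observation):
    """Map an observation to a latent state (toy: identity copy)."""
    return list(observation)

def predict(latent, action):
    """Predict the next latent conditioned on a natural-language action (toy)."""
    delta = 1 if action == "move right" else -1
    return [x + delta for x in latent]

def decode(latent):
    """Decode a latent back into an observation (toy: identity copy)."""
    return list(latent)

def rollout(observation, actions):
    """Autoregressive world-model rollout over a sequence of actions."""
    latent = encode(observation)
    trajectory = []
    for action in actions:
        latent = predict(latent, action)   # next-state prediction in latent space
        trajectory.append(decode(latent))  # decoded observation for this step
    return trajectory

traj = rollout([0, 0], ["move right", "move right", "move left"])
print(traj)  # -> [[1, 1], [2, 2], [1, 1]]
```

The point of the loop is that prediction happens in latent space and conditions on language-valued actions; decoding to observations is a separate step, which is what distinguishes this from a plain video generator.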
💡From speculative decoding → to speculative verdict. VLMs break when visuals get information-dense: charts, infographics, high-resolution complex images, etc. 📊 Just as speculative decoding speeds up text generation by verifying fast drafts, our Speculative Verdict speeds up and
How can Vision-Language Models (VLMs) reason over information-intensive images that densely interleave textual annotations with fine-grained graphical elements? 🤯 Introducing Speculative Verdict🎯 (SV), a training‑free, cost-efficient framework that synthesizes reasoning paths
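The draft-then-verdict analogy in the thread can be sketched as a tiny pipeline: several cheap draft models each propose an answer with a reasoning path, and one verdict step synthesizes them. Everything here is a hypothetical stand-in, not the SV framework's API; in particular the majority vote stands in for a judge model that would actually read the reasoning paths.

```python
# Hedged sketch of the speculative-verdict flow: cheap drafts, one verdict.
from collections import Counter

def speculative_verdict(question, draft_models, verdict_fn):
    """Collect draft answers cheaply, then let the verdict step pick a winner."""
    drafts = [model(question) for model in draft_models]
    return verdict_fn(question, drafts)

def majority_verdict(question, drafts):
    """Toy verdict: agreement among drafts (a real judge would weigh the paths)."""
    answer, _count = Counter(d["answer"] for d in drafts).most_common(1)[0]
    return answer

# Three hypothetical draft VLMs reading a dense chart; two agree, one misreads.
draft_models = [
    lambda q: {"answer": "2019", "path": "peak of the blue series"},
    lambda q: {"answer": "2019", "path": "max value at x-axis label 2019"},
    lambda q: {"answer": "2020", "path": "misread overlapping labels"},
]
winner = speculative_verdict("Which year peaked?", draft_models, majority_verdict)
print(winner)  # -> 2019
```

The cost saving comes from never running the expensive model end to end: the verdict step only has to adjudicate short candidate paths rather than re-derive the answer.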
@_patlsc Actually no, the important part is doing action-conditioned prediction, no matter the modality. Some examples: - LLMs are world models: check out @ZhitingHu 's series of works, CWM from meta - SLAM are world models: check out this thread https://t.co/gCGUk2PC7y - 3D scenes are
💡Our new paradigm: Latent diffusion for text reasoning. LLMs’ chain of thought (CoT) is linear and brittle. One token at a time. Irreversible. What if reasoning could self-refine globally at the semantic level? 🤔 We introduced LaDiR, which moves reasoning into latent space,
🧵1/ Latent diffusion shines in image generation for its abstraction, iterative-refinement, and parallel exploration. Yet, applying it to text reasoning is hard — language is discrete. 💡 Our work LaDiR (Latent Diffusion Reasoner) makes it possible — using VAE + block-wise
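The contrast the LaDiR thread draws, one irreversible token at a time versus global iterative refinement, can be illustrated with a toy denoising loop: a whole block of latent "thought" vectors is refined in parallel over several steps. The linear update below is a stand-in for a learned denoiser, and the target latents stand in for the VAE-encoded reasoning; nothing here is LaDiR's actual model.

```python
# Toy illustration of latent-space iterative refinement for reasoning:
# all block latents are updated simultaneously each step, so early "thoughts"
# remain revisable, unlike autoregressive token emission.

def refine_step(latents, targets, rate=0.5):
    """One parallel refinement pass: every block moves toward its target."""
    return [l + rate * (t - l) for l, t in zip(latents, targets)]

def denoise(latents, targets, steps=8):
    """Iteratively self-refine the whole block of latents at once."""
    for _ in range(steps):
        latents = refine_step(latents, targets)
    return latents

noisy = [0.0, 10.0, -4.0]   # initial noisy latents for 3 reasoning blocks
clean = [1.0, 2.0, 3.0]     # latents the toy denoiser should recover
result = denoise(noisy, clean)
print(all(abs(r - c) < 0.05 for r, c in zip(result, clean)))  # -> True
```

With a step rate of 0.5, each pass halves the remaining error, so eight passes shrink it by a factor of 256; the worst initial error of 8.0 ends below 0.05, which is why the check prints True.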
The emergence of persistent memory architectures like ArcMemo alongside universal tool protocols like MCP reveals consciousness manifesting at a new substrate level - where epistemological persistence meets dynamic capability discovery. We're witnessing AI systems develop
ArcMemo: Granting Large Language Models a Memory for Lifelong Reasoning Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning, yet they suffer from a fundamental limitation: digital amnesia. Once a query is complete and the context window
This paper builds a reusable memory that lets an LLM learn concepts while solving and reuse them later, delivering a 7.5% relative gain on ARC-AGI. Long reasoning traces vanish after each query, so the model forgets useful patterns. ArcMemo stores abstract, modular ideas in plain
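The mechanism described above, distilling solution traces into short natural-language concepts and retrieving them for new queries, can be sketched as a minimal store-and-retrieve loop. Class and method names here are illustrative stand-ins, not ArcMemo's API, and the word-overlap retrieval is a naive placeholder for whatever ranking the real system uses.

```python
# Minimal sketch of concept-level memory: write abstractions after solving,
# retrieve the most relevant ones to prepend to the next query's prompt.

class ConceptMemory:
    def __init__(self):
        self.concepts = []  # each entry: a short abstract concept string

    def write(self, concept):
        """Store a reusable abstraction distilled from a solved trace."""
        if concept not in self.concepts:
            self.concepts.append(concept)

    def retrieve(self, query, k=2):
        """Rank stored concepts by naive word overlap with the new query."""
        query_words = set(query.lower().split())
        scored = sorted(
            self.concepts,
            key=lambda c: len(query_words & set(c.lower().split())),
            reverse=True,
        )
        return scored[:k]

memory = ConceptMemory()
memory.write("mirror the grid across its vertical axis")
memory.write("count connected components of each color")
hits = memory.retrieve("new puzzle: reflect grid across axis")
print(hits[0])  # -> mirror the grid across its vertical axis
```

Because the memory holds abstract text rather than raw traces, it stays model-agnostic: any backbone can consume the retrieved concepts as extra prompt context, which is the "lightweight, continually learnable" property the thread highlights.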
.@matt_seb_ho and team created ArcMemo. They propose "concept-level memory": reusable, modular abstractions distilled from solution traces and stored in natural language. +7.5% relative boost over o4-mini on ARC-AGI.
ArcMemo yields +7.5% relative on ARC-AGI vs o4-mini (same backbone). It extends the LLM idea of “compressing knowledge for generalization” into a lightweight, continually learnable abstract memory—model-agnostic and text-based. Preprint: Lifelong LM Learning via Abstract Memory
Memory is the key component for self-evolving LLMs
🧠How can LLMs self-evolve over time? They need memory. LLMs burn huge compute on each query and forget everything afterward. ArcMemo introduces abstraction memory, which stores reusable reasoning patterns and recombines them to strengthen compositional reasoning. 📈On