Siqiao Huang
@KnightNemo_
Followers 570 · Following 240 · Media 30 · Statuses 142
Junior undergrad, Yao class @Tsinghua_Uni. Current intern @mldcmu. Interested in ML & Robotics. World Models / VLAs / Humanoid Foundation Models.
Joined August 2024
Papers are kind of like movies: the first one is usually the best, and the sequels tend to get more complicated but not really more exciting. But that totally doesn’t apply to the DepthAnything series. @bingyikang's team somehow keeps making things simpler and more scalable each time.
After a year of teamwork, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3…
When Dreamer 4 came out, my two takeaways were: 1. Diffusion Forcing / Streaming Video Gen techniques will be the mainstream algorithm choice in WMs. 2. The gap between Video Generation Models and World Models is becoming increasingly small. If we have a good enough Video…
arxiv.org
World models, which predict future transitions from past observation and action sequences, have shown great promise for improving data efficiency in sequential decision-making. However, existing...
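The tweet above cuts off, but since it name-drops Diffusion Forcing, here is a minimal sketch of the core trick as I understand it: sample an independent noise level per frame during training, which is what allows autoregressive, streaming-style video rollout at inference. Everything here (SequenceDenoiser, the toy linear schedule) is illustrative, not code from Dreamer 4 or the Diffusion Forcing paper.

```python
import torch
import torch.nn as nn

class SequenceDenoiser(nn.Module):
    """Hypothetical denoiser: any sequence model mapping noisy frames plus
    per-frame noise levels to predicted clean frames (illustrative only)."""
    def __init__(self, frame_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.GRU(frame_dim + 1, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, frame_dim)

    def forward(self, noisy_frames, noise_levels):
        # noisy_frames: (B, T, D); noise_levels: (B, T), each in [0, 1]
        x = torch.cat([noisy_frames, noise_levels.unsqueeze(-1)], dim=-1)
        h, _ = self.net(x)
        return self.head(h)

def diffusion_forcing_loss(model, frames):
    # The key trick: an INDEPENDENT noise level per frame, instead of one
    # shared level for the whole clip as in standard video diffusion.
    B, T, _ = frames.shape
    t = torch.rand(B, T)                             # per-frame noise level
    alpha = (1.0 - t).unsqueeze(-1)                  # toy linear schedule
    noisy = alpha * frames + (1.0 - alpha) * torch.randn_like(frames)
    pred = model(noisy, t)                           # predict clean frames
    return ((pred - frames) ** 2).mean()             # x0-prediction MSE

model = SequenceDenoiser(frame_dim=64)
loss = diffusion_forcing_loss(model, torch.randn(8, 16, 64))  # B=8, T=16
```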
Excited that this podcast episode with TalkRL is out! 🎙️ We talk about the story behind Dreamer 4, the details of scalable world models, and the future of robotics (and beyond) 🤖🌏🚀 Thanks for the fun conversation, @TalkRLPodcast
“The Limits of My World mean the Limits of My Language” —— Siqiao Huang, Nov. 2025.
“The philosopher Wittgenstein once wrote that “the limits of my language mean the limits of my world.” I’m not a philosopher. But I know at least for AI, there is more than just words. Spatial intelligence represents the frontier beyond language—the capability that links…
There are some projects that are cool, some that are significant. But every once in a while, something like this comes along, and I just lean back in my chair and think, “Damn.” Congrats @li_yitang on this amazing project!!!
Meet BFM-Zero: A Promptable Humanoid Behavioral Foundation Model w/ Unsupervised RL👉 https://t.co/3VdyRWgOqb 🧩ONE latent space for ALL tasks ⚡Zero-shot goal reaching, tracking, and reward optimization (any reward at test time), from ONE policy 🤖Natural recovery & transition…
Today is my last day at @GoogleDeepMind. After almost exactly 10 years at Google including 12 internships and the last 2 1/2 years full time, it really feels like a chapter coming to an end. I'm grateful for all the experiences and friends I've made at Google and DeepMind. I…
Thanks @EmbodiedAIRead @yilun_chen_ for featuring our repo!!!
Awesome World Models GitHub: https://t.co/IBANoRMmIA Newly released one-stop GitHub repo on everything about World Modeling, spanning definition, theory, general approaches, use cases and evaluations in Embodied AI (as well as in other domains like NLP, Agents, etc.). Organized…
@JinWeiyang18434 Btw, I really liked the picture that Nano-Banana🍌 @GeminiApp generated🤣. It integrates the elements seamlessly; generative models nowadays are just super wild. From left to right: - Genie 3 blogpost picture @jparkerholder @shlomifruchter - @ylecun's renowned brain picture - …
This repo covers key papers and research on World Models across multiple domains, including Embodied AI, Autonomous Driving, NLP, and more. If you find it useful, please give it a star ⭐! PRs are always welcome. 🔗: https://t.co/DDG1wU9WRB Shoutout to @JinWeiyang18434, and we…
github.com
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling. - knightnemo/Awesome-World-Mo...
Introducing🌍 Awesome-World-Models, a one-stop GitHub repo of everything there is to know about world models! A new, curated resource list for everyone interested in "World Models," aiming to be a go-to guide for researchers and developers in the field. 🧵(1/n)
Thanks for sharing! 😎
https://t.co/bV7KMwCTvx Credit to @KnightNemo_: the WM reading list is worth checking out!
Tagging a few people who may be interested: @huskydogewoof @_amirbar @Haoyu_Xiong_ @K_Sta8is @CSProfKGD @TongheZhang01 @li_yitang @leoliuym @xiao_ted @DrJimFan @ChongZitaZhang @chuning_zhu @liuziwei7 @ElijahGalahad @xxunhuang
Personally, I'm more interested in latent WMs. But since nobody is mentioning it, here are a few reasons why pixel space also makes sense: 1. One's purpose determines one's standpoint. For sequential decision making, pixel space makes no sense; but for game simulation, pixel is everything. 2. …
On world models / egocentric visual dynamics models, on building robotic simulation, and on building robotic genAI models: being visually realistic doesn't mean being physically accurate or semantically correct.
Punchline: World models == VQA (about the future)! Planning with world models can be powerful for robotics/control. But most world models are video generators trained to predict everything, including irrelevant pixels and distractions. We ask: what if a world model only…
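The quoted thread is truncated and its exact method isn't shown here, so the following is only a toy sketch of the "world model as future VQA" framing: condition on a question about the future and train with a classification loss on the answer, instead of a pixel reconstruction loss. All names and shapes (FutureVQAWorldModel, the GRU encoder) are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FutureVQAWorldModel(nn.Module):
    """Toy 'VQA about the future' world model: answer a task-relevant
    question about what will happen, rather than render future pixels."""
    def __init__(self, obs_dim, act_dim, q_dim, n_answers, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(obs_dim + act_dim, hidden, batch_first=True)
        self.answer_head = nn.Sequential(
            nn.Linear(hidden + q_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_answers),    # e.g. "will the mug tip over?"
        )

    def forward(self, obs, acts, question):
        # obs: (B, T, obs_dim); acts: (B, T, act_dim); question: (B, q_dim)
        _, h = self.encoder(torch.cat([obs, acts], dim=-1))
        return self.answer_head(torch.cat([h[-1], question], dim=-1))

# Training is plain cross-entropy on future answers; no pixel loss at all.
model = FutureVQAWorldModel(obs_dim=32, act_dim=4, q_dim=16, n_answers=2)
logits = model(torch.randn(8, 10, 32), torch.randn(8, 10, 4), torch.randn(8, 16))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
```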
World Models in game simulations are cool. But the real challenge is using them to advance robotics. This is twofold: 1. As a source of data for policy training. 2. As a verifier for test-time scaling (TTS) and policy evaluation. Glad to see both aspects coming into play in this awesome work.
Rollouts in the real world are slow and expensive. What if we could roll out trajectories entirely inside a world model (WM)? Introducing 🚀Ctrl-World🚀, a generative manipulation WM that can interact with an advanced VLA policy in imagination. 🧵1/6
Not sure if now is the best time to do world model research, but it surely is a good time for making world model memes🤣
🤖 Robots rarely see the world's true state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
Excited to announce our NeurIPS ’25 tutorial: Foundations of Imitation Learning: From Language Modeling to Continuous Control, with Adam Block & Max Simchowitz (@max_simchowitz).
To wrap up — world models are evolving fast, but they’re not the next LLMs. The real gold lies in video generation, generalist policies, and the integration of sensorimotor control and abstraction. The full blog😎: 👉 https://t.co/yXRQ08iapW Would love to hear your takes — hype, hope, or…
🗺️How About JEPA-Style World Models? LeCun’s JEPA may not be the final form of world models, but its latent-space learning idea is gold. Most modern video diffusion models already operate in latent space — using near-lossless VAEs as encoders. Future world models could co-train…
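Since that tweet is cut off, here's a minimal, assumed sketch of the JEPA-style idea it describes: predict the latent of a future frame from the latent of the current one, with an EMA target encoder and a stop-gradient to avoid latent collapse. The dimensions and EMA rate are placeholders, not LeCun's exact recipe.

```python
import copy
import torch
import torch.nn as nn

# JEPA-flavoured sketch: regress the *latent* of a future frame from the
# latent of the current frame; the target branch never backpropagates.
encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
target_encoder = copy.deepcopy(encoder)        # slow EMA copy of the encoder
predictor = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

def jepa_loss(x_t, x_next, ema=0.996):
    z = encoder(x_t)
    with torch.no_grad():                      # stop-gradient on the target
        z_next = target_encoder(x_next)
    loss = ((predictor(z) - z_next) ** 2).mean()
    # EMA update keeps targets slowly moving, which helps avoid collapse
    for p, tp in zip(encoder.parameters(), target_encoder.parameters()):
        tp.data.mul_(ema).add_(p.data, alpha=1 - ema)
    return loss

x_t, x_next = torch.randn(8, 512), torch.randn(8, 512)
print(jepa_loss(x_t, x_next))
```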
🍫Physics vs Data: The Bitter Lesson. Simulator = prior-driven. World model = data-driven. Given enough data, data-driven wins — always. But adding priors still boosts performance in narrow domains. It’s the classic tradeoff: generalization vs performance. So “physics-informed…