Siqiao Huang

@KnightNemo_

Followers
570
Following
240
Media
30
Statuses
142

Junior undergrad, Yao class @Tsinghua_Uni. Current intern @mldcmu. Interested in ML & Robotics. World Models / VLAs / Humanoid Foundation Models.

Joined August 2024
@sainingxie
Saining Xie
22 hours
papers are kind of like movies: the first one is usually the best, and the sequels tend to get more complicated but not really more exciting. But that totally doesn’t apply to the DepthAnything series. @bingyikang's team somehow keeps making things simpler and more scalable each
@bingyikang
Bingyi Kang
1 day
After a year of team work, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3
5
29
398
@KnightNemo_
Siqiao Huang
2 days
When Dreamerv4 came out, the two takeaways for me are: 1. Diffusion Forcing / Streaming Video Gen techniques will be the mainstream algorithm choice in WMs 2. The gap between Video Generation Models and World Models is becoming increasingly small. If we have a good enough Video
arxiv.org
World models, which predict future transitions from past observation and action sequences, have shown great promise for improving data efficiency in sequential decision-making. However, existing...
@danijarh
Danijar Hafner
5 days
Excited for this podcast episode with TalkRL to be out! 🎙️ We talk about the story behind Dreamer 4, the details of scalable world models, and the future of robotics (and beyond) 🤖🌏🚀 Thanks for the fun conversation, @TalkRLPodcast
2
8
83
@KnightNemo_
Siqiao Huang
5 days
“The Limits of My World means the Limits of My Language” —— Siqiao Huang, Nov. 2025.
@drfeifei
Fei-Fei Li
5 days
“The philosopher Wittgenstein once wrote that “the limits of my language mean the limits of my world.” I’m not a philosopher. But I know at least for AI, there is more than just words. Spatial intelligence represents the frontier beyond language—the capability that links
0
0
17
@KnightNemo_
Siqiao Huang
8 days
There are some projects that are cool, some that are significant. But every once in a while, something like this comes across— and I just lean back in my chair and think, “Damn.” Congrats @li_yitang on this amazing project!!!
@li_yitang
Yitang Li
8 days
Meet BFM-Zero: A Promptable Humanoid Behavioral Foundation Model w/ Unsupervised RL👉 https://t.co/3VdyRWgOqb 🧩ONE latent space for ALL tasks ⚡Zero-shot goal reaching, tracking, and reward optimization (any reward at test time), from ONE policy 🤖Natural recovery & transition
2
5
23
@danijarh
Danijar Hafner
12 days
Today is my last day at @GoogleDeepMind. After almost exactly 10 years at Google including 12 internships and the last 2 1/2 years full time, it really feels like a chapter coming to an end. I'm grateful for all the experiences and friends I've made at Google and DeepMind. I
146
51
2K
@KnightNemo_
Siqiao Huang
12 days
Thanks @EmbodiedAIRead @yilun_chen_ for featuring our repo!!!
@EmbodiedAIRead
Embodied AI Reading Notes
13 days
Awesome World Models Github:  https://t.co/IBANoRMmIA Newly released one-stop github repo on everything about World Modeling, spanning definition, theory, general approaches, use cases and evaluations in Embodied AI (as well as in other domains like NLP, Agent, etc). Organized
0
0
17
@KnightNemo_
Siqiao Huang
15 days
@JinWeiyang18434 Btw, I really liked the picture that Nano-Banana🍌 @GeminiApp generated🤣. It integrates the elements seamlessly, generative models nowadays are just super wild. From Left to Right: - Genie3 blogpost picture @jparkerholder @shlomifruchter - @ylecun 's renowned brain picture -
0
1
8
@KnightNemo_
Siqiao Huang
15 days
This repo covers key papers and research on World Models across multiple domains, including Embodied AI, Autonomous Driving, NLP, and more. If you find it useful, please give it a star ⭐! PRs are always welcome. 🔗: https://t.co/DDG1wU9WRB Shoutout to @JinWeiyang18434 , and we
github.com
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling. - knightnemo/Awesome-World-Mo...
1
2
21
@KnightNemo_
Siqiao Huang
15 days
Introducing🌍 Awesome-World-Models, a one-stop github repo of everything there is to know about world models! Here is a new, curated one-stop resource list for everyone interested in "World Models," aiming to be a go-to guide for researchers and developers in the field. 🧵(1/n)
15
103
664
@KnightNemo_
Siqiao Huang
15 days
Thanks for sharing! 😎
@JinWeiyang18434
Weiyang Jin
15 days
https://t.co/bV7KMwCTvx Credit to @KnightNemo_, the WM reading list is worth checking!
0
0
2
@KnightNemo_
Siqiao Huang
20 days
Personally, I'm more interested in latent WMs. But since nobody is mentioning it, here is why pixel space also makes sense: 1. One's purpose determines one's standpoint. For sequential decision making, pixel space makes no sense; but for game simulation, pixel is everything. 2.
@ChongZitaZhang
C Zhang
21 days
On world model / egocentric visual dynamics model, also on building robotic simulation, also on building robotic genAI models: Being visually realistic doesn't mean being physically accurate and semantically correct.
6
2
87
@abhishekunique7
Abhishek Gupta
22 days
Punchline: World models == VQA (about the future)! Planning with world models can be powerful for robotics/control. But most world models are video generators trained to predict everything, including irrelevant pixels and distractions. We ask - what if a world model only
12
69
404
@KnightNemo_
Siqiao Huang
22 days
World Models in Game Simulations are cool. But the real challenge is using them to advance robotics. This is twofold: 1. As a source of data for policy training 2. As a verifier for tts and policy evaluation Glad to see both aspects coming into play in this awesome work.
@GYanjiang
Yanjiang Guo
23 days
Rollouts in the real world are slow and expensive. What if we could rollout trajectories entirely inside a world model (WM)? Introducing 🚀Ctrl-World🚀, a generative manipulation WM that can interact with advanced VLA policy in imagination. 🧵1/6
0
0
15
@KnightNemo_
Siqiao Huang
27 days
Not sure if now is the best time to do world model research, but it surely is a good time for making world model memes🤣
11
40
376
@yus167
Yuda Song
1 month
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
2
39
133
@canondetortugas
Dylan Foster 🐢
1 month
Excited to announce our NeurIPS ’25 tutorial: Foundations of Imitation Learning: From Language Modeling to Continuous Control With Adam Block & Max Simchowitz (@max_simchowitz)
6
50
359
@KnightNemo_
Siqiao Huang
1 month
To wrap up — world models are evolving fast, but they’re not the next LLMs. The real gold lies in video generation, generalist policies and integration of sensorimotor and abstraction. The full blog😎: 👉 https://t.co/yXRQ08iapW Would love to hear your takes — hype, hope, or
0
0
4
@KnightNemo_
Siqiao Huang
1 month
🗺️How About JEPA-Style World Models? LeCun’s JEPA may not be the final form of world models, but its latent-space learning idea is gold. Most modern video diffusion models already operate in latent space — using near-lossless VAEs as encoders. Future world models could co-train
1
0
2
@KnightNemo_
Siqiao Huang
1 month
🍫Physics vs Data: The Bitter Lesson Simulator = prior-driven. World model = data-driven. Given enough data, data-driven wins — always. But adding priors still boosts performance in narrow domains. It’s the classic tradeoff: generalization vs performance. So “physics-informed
1
0
3