
Jonathan Huang
@jonathanhuang11
Followers: 204
Following: 27
Media: 0
Statuses: 30
Stanford University
Joined November 2012
🎙️ @akapoor_av8r is building @genrobotics_ai to solve the biggest deployment problem in robotics: Getting real robots to work in the real world. In this episode, he shares how he’s doing it, and why most robotics stacks aren’t built to scale. We talk about growing up in India,
4
3
18
Unveiling agentic robotics on GRID: our blueprint for machines that can reason, converse, compose, and remember. Agents have transformed software; now it's time for robotics.
- Modular tools for perception, planning, and robot control.
- Scalable, elastic infrastructure to
3
27
98
❓ How can humanoids learn to squat and open a drawer? Reward-tuning for every such whole-body task is infeasible. 🚀 Meet DreamControl: robots "dream" how people move and manipulate objects in varied scenarios, practice using them in simulation, and then act naturally in the
7
82
342
Last week, we shared DreamControl, a scalable framework for whole-body humanoid control that fuses diffusion priors with reinforcement learning to enable real-world scene interaction.
- Diffusion + RL → natural whole-body skills
- Policies run in real time → bridges sim-to-real
5
30
129
Very excited to share what we've been cooking up at @genrobotics_ai!
The power of generative models, now embodied in humanoids. Announcing DreamControl: after a year-long research effort at General Robotics, we present a scalable framework for whole-body humanoid control that fuses diffusion priors with reinforcement learning to unlock
0
1
9
Robots don’t gain navigation skills by repeating one map. They learn by surviving many. Procedural generation gives us controlled variation — new layouts, clutter, and lighting with every run. This builds robustness and mitigates overfitting.
4
17
115
Big news: Scaled Foundations is now General Robotics! This name change better reflects our expanding vision and ambitious goals for the future of robotics. Exciting times ahead! #GeneralRobotics #ScaledFoundations #Robotics
Scaled Foundations is now General Robotics. We’re building general-purpose intelligence for every robot. Across any scenario in the physical world. Blog:
0
0
6
Exciting new work from @sihyun_yu and our team at Google DeepMind! Introducing Memory-Augmented Latent Transformers (MALT) Diffusion, a new diffusion model specialized for long video generation! https://t.co/gcDZr5mVbf
arxiv.org
Diffusion models are successful for synthesizing high-quality videos but are limited to generating short clips (e.g., 2-10 seconds). Synthesizing sustained footage (e.g. over minutes) still...
3
17
111
Simulate multi-agent swarms, create digital twins with Gaussian Splatting, explore autonomous behaviors across diverse environments—and more. Meet AirGen, the evolution of AirSim. Blog: https://t.co/BnFVb6UqZQ Thread 🧵
1
18
71
Just another day at the office!
0
0
7
Can we leverage pre-trained encoders to kickstart training a diffusion model? Why yes... yes we can! Check out our recent work led by the fearless @sihyun_yu 🎉
Introducing REPA! We show that learning high-quality representations in diffusion transformers is crucial for boosting generation performance. With REPA, we speed up SiT training by 17.5x (without CFG) and achieve state-of-the-art FID = 1.42 using CFG with the guidance interval.
0
0
5
Trained in house at Scaled Foundations @ScaFoAI !
Announcing MatMamba - an elastic Mamba2 🐍 architecture with 🪆 Matryoshka-style training and adaptive inference. Train a single elastic model, get 100s of nested submodels for free! Paper: https://t.co/MIJeaWmwYE Code: https://t.co/wPB8frMeFF 🧵(1/10)
0
0
7
Deep insights on the state of the robotics industry by the one and only Ashish Kapoor! @akapoor_av8r
7 lessons from AirSim: I ran the autonomous systems and robotics research effort at Microsoft for nearly a decade and here are our biggest learnings. A thread 🧵 Complete blog:
0
0
3
Try out @allen_ai’s Molmo VLM on Open GRID now! VLMs like Molmo bring a rich layer of semantic knowledge to robots - allowing them to respond to user queries and interpret complex environments with ease. Scale autonomous AI solutions with state-of-the-art AI models on GRID today!
Meet Molmo: a family of open, state-of-the-art multimodal AI models. Our best model outperforms proprietary systems, using 1000x less data. Molmo doesn't just understand multimodal data—it acts on it, enabling rich interactions in both the physical and virtual worlds. Try it
1
7
18
Introducing GRID Enterprise: a private GRID experience that is scalable, customizable, and seamlessly integrated into your dev pipeline. 🔗: https://t.co/k3P8N0qgKK 🧵(1/5)
4
20
80
We've opened the waitlist for General Robot Intelligence Development (GRID) Beta! Accelerate robotics dev with our open, free & cloud-based IDE. Zero setup needed. Develop & deploy advanced skills with foundation models and rapid prototyping 🔗: https://t.co/jhWZCmxNjb 🧵(1/6)
5
92
272
Congratulations to the authors of "VideoPoet: A Large Language Model for Zero-Shot Video Generation" for winning one of this year's @icmlconf Best Paper Awards! #ICML2024 Paper: https://t.co/JinpikSveV Blog post: https://t.co/jdqehGqWW6
9
50
266
Introducing the Auto Arborist Dataset, a multiview urban tree classification dataset that consists of ~2.6M trees and >320 genera, which can aid in the development of models for urban forest monitoring. Check it out at → https://t.co/uldYCWuCs3
6
85
323
Most #DeepLearning approaches for instance segmentation rely on labeled datasets that are difficult to collect. Today we present an approach that achieves state-of-the-art performance in the partially-supervised setting, which requires less labeled data ↓
research.google
Posted by Vighnesh Birodkar, Research Software Engineer, and Jonathan Huang, Research Scientist, Google Research. Instance segmentation is the task o...
6
107
316
Training an object detector using Cloud Machine Learning Engine https://t.co/Pya2VfBtiw via @GCPBigData
0
2
2