Kaleem
@kaleemcs
Followers: 543
Following: 116K
Media: 225
Statuses: 7K
Asimov, an open-source humanoid.
Self-assembling flying robots! 🥏 Researchers from University of Pennsylvania introduced ModQuad, a modular aerial robot system where individual quadrotors can dock with each other while flying and then operate as a single cooperative structure. Each unit is a quadrotor
Dexterous World Models (DWM)
We introduce 🖐️🌏 Dexterous World Models (DWM), a scene-action-conditioned video diffusion model that simulates human manipulation in static 3D scenes from egocentric hand motions. 📄 Paper: https://t.co/u2IldpfYTD 🌐 Project Page: https://t.co/YSE3rOmV1s
3D Gaussian Splatting
I have just delivered a web visualization platform based on 3D Gaussian Splatting to a client in China. A stable, lightweight, and scalable 3D platform was built using @playcanvas to display high-precision scene data from the real world. ✨ Project Highlights: 1.
Robots don’t wait. So why should your model? Large VLAs run in real time… with no training-time changes. ❗️Worth reading if you’re working on real-world deployment of large models in robotics. I found this write-up on Real-Time Action Chunking (RTC) from Physical Intelligence,
A drone that flies, drives, and switches modes in 0.1 seconds: [Build it yourself: CAD + parts ⬇️] No extra actuators, no deformation, just clever mechanics and full control. DUAWLFIN is a ground-aerial robot with unified actuation: flying like a quadcopter, rolling like a
How can we run reconstruction models like π³ and Depth Anything 3 in real-time? We present KV-Tracker, a training-free approach, for real-time tracking of scenes and objects. Achieving up to 30 FPS! With @alzugarayign, @makezur, @XinKong_IC and @AjdDavison
✨ We introduce LongVie 2, an end-to-end autoregressive video world model with: 🕹️ Strong Controllability 🎨 Long-term Visual Fidelity 🔒Temporal Consistency - Page: https://t.co/XjFVgw4uf1 - Paper: https://t.co/mvRCUxYg5t
📢 Intrinsic Image Fusion for Multi-View 3D Material Reconstruction 📢 We combine generative material priors with inverse path tracing: 1) define a parametric texture space 2) fuse monocular predictions across views into consistent textures 3) optimize low-dimensional parameters
MicroCAD in the browser is becoming real. I'm building a web-based μCAD editor where code is the model: - Write parametric geometry, compile, and instantly preview. - Export STL with no dependencies. - No heavyweight CAD. - An LLM chat to generate μCAD from text (still
Excited to present ACE-SLAM, the first neural SLAM to use Scene Coordinate Regression as an implicit map representation Efficient (real-time from live stream), compressive (neural maps <1MB) and robust to dynamic scenes With @marwan_ptr and @AjdDavison
https://t.co/tMsD5hTkB3
SAM Audio Judge
While everyone is amazed by SAM Audio, the hidden gem to me is the SAM Audio Judge! It assesses how well separated audio matches a given text description in terms of (1) overall quality, (2) recall, (3) precision, and (4) faithfulness. https://t.co/cpdr0dUAgP
Introducing Particulate: a feed-forward model for 3D object articulation 💻✂️👓🧳 Particulate gives you a fully articulated 3D object, including part segmentation, kinematic structure & motion constraints, in a single forward pass in ~10secs. 🏅SOTA performance! 💡GenAI
Runs AI effects locally in Audacity
Real-time document layout detection based on YOLO-v10
Dataset release for RA-L 2025 paper "Towards Degradation-Robust High-Precision Mapping: A Large-Scale LiDAR–Inertial Dataset". https://t.co/m5uLlnNHZO…
github.com
CNITECH-CV-LAB has 6 repositories available. Follow their code on GitHub.
Dataset release for RA-L 2025 paper "Towards Degradation-Robust High-Precision Mapping: A Large-Scale LiDAR–Inertial Dataset". https://t.co/fYmbTwDu0x
15TB of physics simulation datasets https://t.co/6ocMM5qXdR…
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
"MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos" TL;DR: 3 learnable modules+lightweight IK stage: a Reference Prompt Encoder that distills per-joint queries from the asset’s skeleton, mesh, and rendered image set; (1/4)
Gaussian Splatting lecture:
Lecture slides for my "Introduction to #ComputerVision" and "#DeepLearning in Computer Vision" courses. 🆕 Gaussian Splatting 🆕 Flow Matching The included videos do not contain voiceovers yet, planned for a future revision.
Gaussian Splatting lecture content:
#KostasKeynoteLessons: Curious about the "Keynote magic" behind my slides? I’m releasing the full Keynote source file for my recent Gaussian Splatting lecture, all 10 GIGAbytes of it! Grab the files in the thread and feel free to remix.