Salman Khan Profile
Salman Khan

@KhanSalmanH

Followers 1K · Following 2K · Media 106 · Statuses 351

Faculty at MBZUAI. Past: Inception, ANU, Data61, NICTA, UWA.

Canberra, Australia
Joined February 2011
Salman Khan (@KhanSalmanH) · 1 day
RT @Dr_HammadKhan: 🚨 Do you have expertise in #RemoteSensing and #MachineLearning with a passion for #AgTech? Do you want to lead the develo…
Salman Khan (@KhanSalmanH) · 1 month
RT @alex_lacoste_: [#ICCV2025] Our paper "GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks" is accepted at ICCV 2025…
Salman Khan (@KhanSalmanH) · 2 months
🛰️ Benchmarked on GEO-Bench & Copernicus-Bench across classification, segmentation, and landslide mapping — with state-of-the-art performance. We hope the community will build on this work to further improve. Thanks to the wonderful team! 4/4
Salman Khan (@KhanSalmanH) · 2 months
🖥️ Model key highlights:
• TerraFM uses modality-specific patch embeddings
• Aligned sensor modalities serve as natural augmentations
• Cross-attention modality fusion for feature learning
• A dual-centering loss to learn balanced, scalable features
3/4
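The two architectural ideas above — separate patch embeddings per sensor, then cross-attention fusion — can be sketched in a few lines. This is a minimal toy illustration of the mechanism, not TerraFM's actual implementation; all shapes, widths, and band counts here are made-up assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Modality-specific patch embeddings: a separate projection per sensor,
# mapping raw patch features of different widths into one shared space.
d = 16                                   # shared embedding width (assumed)
W_opt = rng.standard_normal((12, d))     # optical patches: 12 raw features (assumed)
W_sar = rng.standard_normal((2, d))      # SAR patches: 2 raw features, e.g. VV/VH

opt_patches = rng.standard_normal((8, 12))   # 8 optical patch vectors
sar_patches = rng.standard_normal((8, 2))    # 8 spatially aligned SAR patch vectors
opt_tok, sar_tok = opt_patches @ W_opt, sar_patches @ W_sar

# Cross-attention fusion: optical tokens query the SAR tokens, so each
# optical token mixes in information from the co-registered SAR view.
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = opt_tok @ Wq, sar_tok @ Wk, sar_tok @ Wv
attn = softmax(Q @ K.T / np.sqrt(d))     # (8, 8) attention weights
fused = opt_tok + attn @ V               # residual add of SAR-informed values

print(fused.shape)                       # (8, 16)
print(np.allclose(attn.sum(axis=-1), 1.0))   # True: each row is a distribution
```

The per-modality projections are what let sensors with very different channel counts share one token space before fusion.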
Salman Khan (@KhanSalmanH) · 2 months
📦 Scale matters:
• Trained on 18.7 million tiles of spatially aligned SAR and Sentinel-2 (L1C & L2A) data
• With large 534×534-pixel tiles, we pretrain the model on 23 trillion pixels!
• Data sampled globally across diverse land cover types
2/4
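A quick back-of-envelope check of those numbers (my arithmetic, not from the thread): tiles × tile area gives the pixel count for a single layer per tile, and the gap to the 23-trillion headline figure suggests the count spans the multiple aligned sensor products per location — that multiplier is my inference, not a stated fact:

```python
# Pixels in one layer per tile: 18.7M tiles of 534 x 534 pixels each.
tiles = 18_700_000
side = 534
pixels_per_layer = tiles * side * side
print(f"{pixels_per_layer:.2e}")    # 5.33e+12 for a single layer per tile

# The 23-trillion total therefore implies roughly 4.3 aligned sensor
# layers per tile location on average (SAR + Sentinel-2 L1C/L2A) --
# an inference from the arithmetic, not a figure given in the thread.
print(23e12 / pixels_per_layer)     # ~4.3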
Salman Khan (@KhanSalmanH) · 2 months
🌍 TerraFM: A Scalable Foundation Model for Multisensor Earth Observation. Excited to share TerraFM, a unified foundation model trained to understand our planet through SAR and optical satellite data.
📄 Paper: 💻 Code:
1/4
Salman Khan (@KhanSalmanH) · 2 months
🔗 Paper: 🔬 Inference Code: 📌 Project Page: 💻 Training code coming soon!
Joint work with: Ghazi Shazan Ahmad, Ahmed Heakl, Hanan Ghani, Abdelrahman Shaker, Zhiqiang Shen, Ranjay Krishna, Fahad Khan.
github.com: Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing" (mbzuai-oryx/VideoMolmo)
Salman Khan (@KhanSalmanH) · 2 months
💡 Key highlights on data:
📊 New dataset: 72K video-caption pairs, 100K+ point labels
🧪 VPoS-Bench spans five real-world domains: Cell Tracking, Egocentric Vision, Autonomous Driving, Video-GUI Interaction, and Robotics. We also evaluate on Referring and Reasoning VOS tasks.
3/4
Salman Khan (@KhanSalmanH) · 2 months
🗺️ Building on the Molmo framework, VideoMolmo is the first LMM tailored for fine-grained spatio-temporal pointing in videos, grounded in language prompts!
✨ A temporal attention module enables per-frame conditioning on prior context, and two-step grounding performs better.
2/4
Salman Khan (@KhanSalmanH) · 2 months
🎥 From robotics to autonomous driving and cell tracking, knowing what, where, and when objects appear is key. VideoMolmo tackles this by first pointing to objects in space-time using language, then precisely segmenting them. 1/4
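The point-then-segment idea above can be illustrated with a toy stand-in (this is not VideoMolmo's actual pipeline, which uses a learned language-grounded pointer and a segmentation model): assume the pointing stage has already produced an (row, col) seed, and let a flood fill over a binary mask play the role of the segmentation stage that expands the point into the object's region:

```python
from collections import deque

def segment_from_point(mask, seed):
    """Flood-fill the connected foreground region containing `seed`.
    mask: 2D list of 0/1; seed: (row, col). A toy stand-in for the
    'point then segment' stage: the pointer picks the object, the
    fill recovers its full extent."""
    h, w = len(mask), len(mask[0])
    r0, c0 = seed
    if not mask[r0][c0]:
        return set()                      # pointed at background
    seen, q = {seed}, deque([seed])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                q.append((nr, nc))
    return seen

# Two separate blobs; pointing at (0, 1) segments only the top-left one.
mask = [[1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1]]
region = segment_from_point(mask, (0, 1))
print(sorted(region))   # [(0, 0), (0, 1), (1, 1)]
```

The point disambiguates *which* object is meant — the same mask with seed (1, 3) would return the other blob.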
Salman Khan (@KhanSalmanH) · 2 months
🌐 Project page: 📄 Paper: 🤗 Dataset: 💻 Code:
Thanks to the amazing team: Hanoona Rasheed, Abdel-Rahman Shaker, Anqi Tang, Muhammad Maaz, Ming-Hsuan Yang, Salman Khan, Fahad Khan
Salman Khan (@KhanSalmanH) · 2 months
Early results 👉
• Models stumble as they miss a key frame or spoken detail
• Size helps, but architecture & training details matter more
• New open models now rival proprietary ones
3/4
Salman Khan (@KhanSalmanH) · 2 months
VideoMathQA: a benchmark that tests LLMs on educational videos rather than static screenshots or text-based math problems. Challenges:
• Dynamic visuals (changing diagrams, handwriting)
• Long-range temporal reasoning
• Cross-modal grounding of video 🎥, audio 🎙️, text 📝
2/4
Salman Khan (@KhanSalmanH) · 2 months
⚡ Can LLMs watch a math lesson and reason like we do? That means following evolving diagrams, spoken explanations, and handwritten notes, understanding symbols & logic, and connecting them with subject knowledge — all at once. We try to answer this via the VideoMathQA benchmark. 1/4
Salman Khan (@KhanSalmanH) · 2 months
RT @vidllms: 🚀 Heading to #CVPR2025 in Nashville? Don’t miss the VideoLLMs Workshop on June 11 (Grand A1)!
• 🔑 6 keynotes
• 🗣️ Panel “Video…
Salman Khan (@KhanSalmanH) · 2 months
RT @SameeraRamasin1: In model parallel training, compressing the signal flow has been established as not very useful as they are too inform…
Salman Khan (@KhanSalmanH) · 2 months
RT @PluralisHQ: We've reached a major milestone in fully decentralized training: for the first time, we've demonstrated that a large langua…
Salman Khan (@KhanSalmanH) · 2 months
🚀 Introducing ThinkGeo: a benchmark for tool-augmented LLMs on real-world remote sensing tasks! 🌍🛰️ 436 tasks across key domains; 14 tools; step-by-step SOTA model evaluation. Open-source! 🔗 #GeospatialAI
Salman Khan (@KhanSalmanH) · 2 months
🌐 Links: • Paper: • Code: • Data:
Salman Khan (@KhanSalmanH) · 2 months
We also specify the types of errors in existing models' outputs.
👥 Thanks to the team! Tajamul Ashraf · Amal Saqib · Hanan Ghani · Muhra AlMahri · Yuhao Li · Noor Ahsan · Umair Nawaz · Jean Lahoud · Hisham Cholakkal · Mubarak Shah · Philip Torr · Fahad Khan · Anwer Rao. 4/4