Ferjad Naeem Profile
Ferjad Naeem

@ferjadnaeem

Followers
912
Following
1K
Media
10
Statuses
333

Research Scientist @Google

Zürich, Switzerland
Joined May 2010
@ferjadnaeem
Ferjad Naeem
2 months
A big congratulations to the whole Gemini team on pushing this amazing family of models out 😄. Our tech report is out now: Feels a bit unreal to share the contributors list with all the amazing colleagues.
@GoogleDeepMind
Google DeepMind
2 months
Hot Gemini updates off the press. 🚀 Anyone can now use 2.5 Flash and Pro to build and scale production-ready AI applications. 🙌 We're also launching 2.5 Flash-Lite in preview: the fastest model in the 2.5 family to respond to requests, with the lowest cost too. 🧵
0
1
10
@ferjadnaeem
Ferjad Naeem
2 months
RT @jacklangerman: Active Data Curation Effectively Distills Large-Scale Multimodal Models: compute per-sample loss with a large batch; on…
0
4
0
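The curation recipe hinted at in the retweet above (the ACID / active-curation line of work) can be sketched as learnability-based selection: score every candidate in a large super-batch by how much higher the learner's loss is than a strong reference model's, then keep only the top-scoring examples for the actual training step. A toy sketch, assuming per-sample losses are already computed; the function name and values here are hypothetical, and the exact objective is the one in the paper.

```python
import numpy as np

def select_learnable(learner_loss, reference_loss, k):
    """Score each super-batch candidate by 'learnability': high loss for
    the learner but low loss for the reference model. Return the indices
    of the top-k candidates, best first."""
    score = learner_loss - reference_loss
    return np.argsort(score)[-k:][::-1]

# Toy super-batch of 6 candidate examples.
learner = np.array([2.0, 0.1, 1.5, 3.0, 0.5, 2.5])
reference = np.array([0.2, 0.1, 1.4, 2.9, 0.4, 0.3])
idx = select_learnable(learner, reference, k=2)  # keep the 2 most learnable
```

Examples the reference already solves but the learner does not (indices 5 and 0 above) are kept; examples both models find easy or both find hard are dropped.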
@ferjadnaeem
Ferjad Naeem
2 months
Stop by and see this amazing work from Vishaal and the team today at CVPR.
@vishaal_urao
Vishaal Udandarao
2 months
Our ACID paper showing how you can use active data curation as an effective way to pretrain super-strong smol and efficient VL-encoders. Poster #361 in the Poster Hall from 10:30 AM - 12:30 PM on Saturday, 14th June.
0
1
9
@ferjadnaeem
Ferjad Naeem
2 months
RT @wightmanr: timm's got a new vision transformer (NaFlexVit), and it's flexible! I've been plugging away at this for a bit, integrating i…
0
38
0
@ferjadnaeem
Ferjad Naeem
3 months
RT @sundarpichai: At #GoogleIO, we shared how decades of AI research have now become reality. From a total reimagining of Search to Agent…
0
2K
0
@ferjadnaeem
Ferjad Naeem
4 months
RT @mtschannen: We are presenting JetFormer at ICLR this morning, poster #190. Stop by if you're interested in unified multimodal architect…
0
31
0
@ferjadnaeem
Ferjad Naeem
4 months
RT @andrefaraujo: Google's global PhD Fellowship program will open for applications this week! (on Apr 10th). This supports PhD students in…
research.google
0
1
0
@ferjadnaeem
Ferjad Naeem
5 months
Check out the strongest open-source dense prediction models from our colleagues!
@kmaninis
Kevis-Kokitsi Maninis
5 months
📢📢 We released checkpoints and PyTorch/JAX code for TIPS. Paper updated with distilled models, and more: #ICLR2025
0
0
5
@ferjadnaeem
Ferjad Naeem
5 months
RT @rgilman33: The majority of features in this layer of SigLIP 2 are multimodal. I'd expected some multimodality but was surprised that t…
0
25
0
@ferjadnaeem
Ferjad Naeem
6 months
Fully supportive of this. The machine learning / computer vision review process is broken, with irresponsible reviewers. Glad to see there is some accountability.
@CVPR
#CVPR2025
6 months
#CVPR2025 Area Chairs (ACs) identified a number of highly irresponsible reviewers, those who either abandoned the review process entirely or submitted egregiously low-quality reviews, including some generated by large language models (LLMs). 1/2.
0
0
5
@ferjadnaeem
Ferjad Naeem
6 months
Delighted to share that ACED has been accepted at CVPR 2025! Check out our work to learn how we distill the strongest smol-size image-text contrastive models.
@ferjadnaeem
Ferjad Naeem
8 months
Check out our latest work that explores data curation as a paradigm to learn compute-efficient image-text contrastive models. Had a blast collaborating across Google, DeepMind, Tübingen and Cambridge on this work.
0
0
16
@ferjadnaeem
Ferjad Naeem
6 months
RT @mtschannen: 📢2⃣ Yesterday we released SigLIP 2! TL;DR: Improved high-level semantics, localization, dense features, and multilingual…
0
34
0
@ferjadnaeem
Ferjad Naeem
6 months
Excited to share what we have been up to in image-text embedding models. SigLIP 2 is the most powerful encoder for most open-vocabulary computer vision and MLLM tasks. Checkpoints are open-sourced, and we look forward to what the community achieves with these.
@XiaohuaZhai
Xiaohua Zhai
6 months
Introducing SigLIP 2: now trained with additional captioning and self-supervised losses! Stronger everywhere: multilingual, cls./ret., localization, OCR, captioning/VQA. Try it out, backward compatible! Models: Paper:
2
4
47
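The "sigmoid" in SigLIP refers to its pairwise loss: instead of a softmax over the whole batch, every image-text pair is treated as an independent binary match/non-match classification. A minimal NumPy sketch of that loss over L2-normalized embeddings (the learnable temperature t and bias b are frozen to illustrative values here; SigLIP 2 adds captioning and self-supervised losses on top of this):

```python
import numpy as np

def siglip_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Pairwise sigmoid contrastive loss.

    Each (image i, text j) pair is an independent binary problem:
    label +1 if i == j (a true pair), -1 otherwise.
    """
    logits = t * img_emb @ txt_emb.T + b           # (N, N) similarity logits
    labels = 2.0 * np.eye(len(img_emb)) - 1.0      # +1 on diagonal, -1 off
    # -log sigmoid(label * logit) = softplus(-label * logit),
    # summed over all N^2 pairs and averaged per image.
    return np.sum(np.logaddexp(0.0, -labels * logits)) / len(img_emb)

# Perfectly matched unit embeddings score lower than shuffled ones.
img = np.eye(2)
good = siglip_loss(img, np.eye(2))        # matched pairs on the diagonal
bad = siglip_loss(img, np.eye(2)[::-1])   # captions swapped between images
```

Because every pair is an independent sigmoid, the loss needs no batch-wide normalization, which is part of what makes it friendly to very large batches.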
@ferjadnaeem
Ferjad Naeem
6 months
RT @haiyang73756134: Excited to see our paper "Tokenformer: Rethinking transformer scaling with tokenized model parameters" accepted as a s…
0
45
0
@ferjadnaeem
Ferjad Naeem
6 months
RT @XiaohuaZhai: Ever thought of training multimodal models with 100 billion 🚀 unique examples? Check out WebLI-100B! The study reveals e…
0
18
0
@ferjadnaeem
Ferjad Naeem
6 months
Joint work with the amazing team: @haiyang73756134 , Fan Yue, @xyongqin , @janericlenssen , Liwei Wang, @fedassa and Bernt Schiele.
0
0
4
@ferjadnaeem
Ferjad Naeem
6 months
Very happy to share that our work "TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters" has been accepted to ICLR 2025 as a spotlight. TokenFormer proposes a new way to incrementally scale transformer training.
5
3
52
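The core idea, as I understand it from the paper: replace a transformer's fixed weight matrices with sets of learnable key/value "parameter tokens" that inputs attend over, so the model can be scaled incrementally by appending parameter tokens rather than retraining from scratch. A toy NumPy sketch under those assumptions (standard softmax here for simplicity, whereas the paper uses a modified normalization; class and variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class Pattention:
    """Token-parameter attention: inputs attend over learnable key/value
    'parameter tokens' instead of passing through a fixed weight matrix."""
    def __init__(self, d_in, d_out, n_params):
        self.keys = rng.normal(size=(n_params, d_in)) * 0.02
        self.values = rng.normal(size=(n_params, d_out)) * 0.02

    def __call__(self, x):                 # x: (batch, d_in)
        attn = softmax(x @ self.keys.T)    # (batch, n_params)
        return attn @ self.values          # (batch, d_out)

    def grow(self, n_new):
        """Incrementally scale by appending parameter tokens; new value
        tokens start at zero so the new slots initially add nothing."""
        self.keys = np.vstack(
            [self.keys, rng.normal(size=(n_new, self.keys.shape[1])) * 0.02])
        self.values = np.vstack(
            [self.values, np.zeros((n_new, self.values.shape[1]))])

layer = Pattention(d_in=4, d_out=3, n_params=8)
x = rng.normal(size=(2, 4))
y = layer(x)
layer.grow(8)            # scale up: 8 -> 16 parameter tokens, same interface
y_grown = layer(x)
```

The input/output interface is unchanged after growing, which is what lets training continue from the smaller model's state instead of restarting.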
@ferjadnaeem
Ferjad Naeem
7 months
RT @fedassa: Google Internship call: we are looking for a PhD student in the area of object/scene understanding and VLMs to join our Google…
0
40
0
@ferjadnaeem
Ferjad Naeem
8 months
RT @GoogleDeepMind: Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet. We're first releasing an experimental version of 2.0…
0
431
0