
Ferjad Naeem
@ferjadnaeem
912 Followers · 1K Following · 10 Media · 333 Statuses
Research Scientist @Google
Zürich, Switzerland
Joined May 2010
A big congratulations to the whole Gemini team on pushing this amazing family of models out 😄. Our tech report is out now. Feels a bit unreal to share the contributor list with all the amazing colleagues.
Hot Gemini updates off the press 🚀 Anyone can now use 2.5 Flash and Pro to build and scale production-ready AI applications 🙌 We're also launching 2.5 Flash-Lite in preview: the fastest model in the 2.5 family to respond to requests, with the lowest cost too. 🧵
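As a rough illustration of what "build with 2.5 Flash" can look like in practice, here is a minimal sketch using the google-genai Python SDK; the prompt, API-key handling, and the choice of model name here are my assumptions for the example, not part of the announcement above.

```python
# Minimal sketch of calling a Gemini 2.5 model via the google-genai Python SDK.
# Assumes the SDK is installed (pip install google-genai) and an API key is
# available in the GEMINI_API_KEY environment variable.
from google import genai

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",  # or "gemini-2.5-pro" / the Flash-Lite preview
    contents="Summarize this paragraph in two sentences: ...",
)
print(response.text)
```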
RT @jacklangerman: Active Data Curation Effectively Distills Large-Scale Multimodal Models - compute per-sample loss with a large batch - on…
Stop by this amazing work from Vishaal and the team today at CVPR.
Our ACID paper shows how you can use active data curation as an effective way to pretrain super-strong smol and efficient VL encoders. Poster #361 in the Poster Hall from 10:30 AM - 12:30 PM on Saturday, 14th June.
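For intuition only, here is a hedged sketch of the kind of loss-based selection this active-data-curation line of work builds on: score a large candidate batch with per-sample contrastive loss and train only on the most informative examples. The scoring rule, the 50% keep ratio, and all names below are illustrative assumptions, not the paper's exact recipe (which uses a more refined curation criterion).

```python
# Illustrative sketch: active data curation for image-text contrastive pretraining.
# Score a large "super-batch" by per-sample loss and keep only the hardest half.
import torch
import torch.nn.functional as F

def select_informative_pairs(image_emb, text_emb, keep_ratio=0.5):
    """Return indices of the image-text pairs with the highest per-sample loss."""
    # Cosine-similarity logits between all images and texts in the super-batch.
    logits = F.normalize(image_emb, dim=-1) @ F.normalize(text_emb, dim=-1).T
    targets = torch.arange(logits.size(0), device=logits.device)
    # Per-sample symmetric cross-entropy loss, without reduction.
    loss_i = F.cross_entropy(logits, targets, reduction="none")
    loss_t = F.cross_entropy(logits.T, targets, reduction="none")
    per_sample_loss = 0.5 * (loss_i + loss_t)
    # Keep the most informative examples for the actual gradient step.
    k = max(1, int(keep_ratio * logits.size(0)))
    return torch.topk(per_sample_loss, k).indices

# Usage: embed a large candidate batch with the learner, call
# select_informative_pairs, then run the normal contrastive update
# on the selected sub-batch only.
```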
RT @wightmanr: timm's got a new vision transformer (NaFlexVit), and it's flexible! I've been plugging away at this for a bit, integrating i….
RT @sundarpichai: At #GoogleIO, we shared how decades of AI research have now become reality. From a total reimagining of Search to Agent….
RT @mtschannen: 📢 We just released the code for JetFormer. Enjoy!
github.com — google-research/big_vision: Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
RT @mtschannen: We are presenting JetFormer at ICLR this morning, poster #190. Stop by if you’re interested in unified multimodal architect….
RT @andrefaraujo: Google's global PhD Fellowship program will open for applications this week! (on Apr 10th). This supports PhD students in….
research.google
Check out the strongest open-source dense prediction models from our colleagues!
📢📢 We released checkpoints and PyTorch/JAX code for TIPS. Paper updated with distilled models, and more. #ICLR2025
RT @rgilman33: The majority of features in this layer of Siglip-2 are multimodal. I'd expected some multimodality but was surprised that t….
Fully supportive of this. The machine learning / computer vision review process is broken by irresponsible reviewers. Glad to see there is some accountability.
#CVPR2025 Area Chairs (ACs) identified a number of highly irresponsible reviewers: those who either abandoned the review process entirely or submitted egregiously low-quality reviews, including some generated by large language models (LLMs). 1/2
Delighted to share that ACED has been accepted at CVPR 2025! Check out our work to learn how to distill the strongest smol-size image-text contrastive models.
Check out our latest work that explores data curation as a paradigm to learn compute-efficient image-text contrastive models. Had a blast collaborating across Google, DeepMind, Tübingen, and Cambridge on this work.
RT @mtschannen: 📢2⃣ Yesterday we released SigLIP 2! . TL;DR: Improved high-level semantics, localization, dense features, and multilingual….
Excited to share what we have been up to in image-text embedding models. SigLIP 2 is the most powerful encoder for most open-vocabulary computer vision and MLLM tasks. Checkpoints are open-sourced and we look forward to what the community achieves with these.
Introducing SigLIP 2: now trained with additional captioning and self-supervised losses! Stronger everywhere:
- multilingual
- classification / retrieval
- localization
- OCR
- captioning / VQA
Try it out, backward compatible! Models: Paper:
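As a rough usage sketch (not from the thread above), SigLIP 2 checkpoints can be loaded through Hugging Face transformers roughly as follows; the exact checkpoint id is an assumption, so check the released model list before running this.

```python
# Hedged sketch: zero-shot image-text scoring with a SigLIP 2 checkpoint via
# Hugging Face transformers. The checkpoint id below is an assumption.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

ckpt = "google/siglip2-base-patch16-224"  # assumed checkpoint name
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

image = Image.open("example.jpg")
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# SigLIP-style models score image-text pairs with a sigmoid, not a softmax.
probs = torch.sigmoid(outputs.logits_per_image)
print(probs)
```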
RT @haiyang73756134: Excited to see our paper "Tokenformer: Rethinking transformer scaling with tokenized model parameters" accepted as a s….
RT @XiaohuaZhai: Ever thought of training multimodal models with 100 billion 🚀 unique examples? . Check out WebLI-100B! The study reveals e….
Joint work with the amazing team: @haiyang73756134, Fan Yue, @xyongqin, @janericlenssen, Liwei Wang, @fedassa and Bernt Schiele.
RT @fedassa: Google Internship call: we are looking for a PhD student in the area of object/scene understanding and VLMs to join our Google….
RT @GoogleDeepMind: Welcome to the world, Gemini 2.0 ✨ our most capable AI model yet. We're first releasing an experimental version of 2.0….