
Yi Luan
@YiLuan9
79 Followers · 1 Following · 0 Media · 6 Statuses
Can we train a text-instructed image retrieval model purely from synthetic data? The answer is yes! MagicLens is the *first* image retrieval model that supports open-ended textual instructions! Unlock multisearch capabilities with SOTA performance and a 50× smaller model! 🚀🚀
1⃣ 🔍 MagicLens was selected as an #ICML Oral (top 1.5%)! 🌟
2⃣ Inference code and models are available! 🚀
🔍 MagicLens reaches SOTA on 10 benchmarks with text, image, or multimodal input. It is the first image retrieval model that takes open-ended textual instructions.
Very happy to contribute to the multimodal benchmarking in the LOFT project! Very excited to see that, with few-shot prompting only, Gemini 1.5 Pro can already outperform CLIP on COCO and Flickr by a large margin at 1M context tokens!
Can long-context language models (LCLMs) subsume retrieval, RAG, SQL, and more? Introducing LOFT: a benchmark stress-testing LCLMs on million-token tasks like retrieval, RAG, and SQL. Surprisingly, LCLMs rival specialized models trained for these tasks!
RT @mandarjoshi_: Excited to present Pix2Struct! It's a general-purpose pixel-to-text model that can be finetuned on tasks with visually-si…
RT @zifeishan: Multilingual 🌏 Entity Linking (EL) datasets are biased towards popular entities, overestimating EL systems' performance. We…
RT @ZhuyunDai: New from Google Research! Retrieval tasks are quite different -- few-shot adaptation is important! With 8 annotated examples…
RT @bhuwandhingra: 🤔 When does a factoid question need a *long* answer? 🤖 "Long" could mean multiple things: either you ask for a city with…