
Hong Liu
@HongLiu9903
Followers: 286
Following: 39
Media: 7
Statuses: 53
Co-founder, Lead Research @VoyageAI.
Joined October 2021
RT @dittycheria: We just launched Voyage-context-3, a new embedding model that gives AI a full-document view while preserving chunk-level p….
voyage-context-3 marks a paradigm shift that reduces reliance on chunking. The idea dates back to last year, when @Yujie_Qian and I discussed how to embed contextual information without breaking VectorDBs. It turns out a new training objective is key to productionizing the idea.
📢 voyage-context-3: contextualized chunk embeddings.
- Automatically captures chunk-level detail and global document context, without metadata augmentation.
- Beats OpenAI-v3-large by 14.24% and Cohere-v4 by 7.89%.
- Binary 512-dim matches OpenAI (float, 3072-dim) in accuracy, but 192x cheaper in
6/6 Come see the future of multimodal embeddings! @HaonanC80190 @luo_yuping 📝 Paper: Code: 🤖 Models: 📚 Datasets: 🌐 Homepage:
haon-chen.github.io
1/6 Introducing MoCa, a new method for continual pre-training of multimodal embeddings! 🚀 MoCa is the first to effectively scale with unlabeled interleaved image-text data, marking a paradigm shift in multimodal embeddings. Paper, code, & checkpoints! 👇 #AI #Multimodal #ML #NLP
🔥 Mind-blown by embedding model progress! In the past two months, we made voyage-3.5-lite outperform its 3x larger predecessor, voyage-3. The secret? Distilling from a larger model (voyage-3-large) is incredibly effective. The future of embeddings is here!
📢 Meet voyage-3.5 and voyage-3.5-lite!
• Flexible dim. and quantizations.
• voyage-3.5 & 3.5-lite (int8, 2048 dim.) are 8% & 6% more accurate than OpenAI-v3-large, and 2.2x & 6.5x cheaper, resp. Also 83% less vectorDB cost!
• 3.5-lite ~ Cohere-v4 in quality, but 83% cheaper.
We trained voyage-code-3 back in November of last year. So far, no other model comes close in code retrieval. Happy to see it shine in the brilliant @continuedev code assistants!
@metcalfc wrote a deep dive on why your custom AI code assistant should include embeddings and a reranker from @VoyageAI🥇
Proud of the team for what we have achieved! Joining MongoDB opens a new chapter of innovations to reshape the landscape of information retrieval and semantic search.
The risk of hallucinations currently holds enterprises back from deploying AI apps. Excited to share that VoyageAI has joined MongoDB to make high-quality AI-powered search and retrieval easy, enabling organizations to build trustworthy AI apps at scale.
voyage-3-large embodies all insights we've learned along the way. It outperforms every model we tested on every type of retrieval task by a considerable margin.
📢 Announcing the new SOTA voyage-3-large embedding model!
• 9.74% over OpenAI and 20.71% over Cohere.
• Flexible dim. (256-2048) and quantizations (float, int8, binary).
• 8.56% over OpenAI with 1/24x storage cost.
• 1.16% over OpenAI with 1/192x storage cost ($10K → $52).
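The storage ratios quoted above can be sanity-checked with a little arithmetic. The sketch below is only illustrative: the announcement does not say which configurations produce the 1/24x and 1/192x figures, so it assumes a 3072-dim float32 baseline compared against 512-dim int8 and 512-dim binary vectors. The helper names `bytes_per_vector` and `storage_ratio` are hypothetical and not part of any Voyage or OpenAI API.

```python
# Sketch: per-vector storage under different dimension / quantization settings.
# Assumption: float = 32 bits, int8 = 8 bits, binary = 1 bit per dimension.
BITS_PER_VALUE = {"float": 32, "int8": 8, "binary": 1}


def bytes_per_vector(dim: int, dtype: str) -> float:
    """Bytes needed to store one embedding of `dim` dimensions."""
    return dim * BITS_PER_VALUE[dtype] / 8


def storage_ratio(base_dim: int, base_dtype: str, dim: int, dtype: str) -> float:
    """How many times smaller the (dim, dtype) vector is than the baseline."""
    return bytes_per_vector(base_dim, base_dtype) / bytes_per_vector(dim, dtype)


# Assumed baseline: 3072-dim float32 embeddings (12,288 bytes per vector).
BASELINE = (3072, "float")

ratio_int8_512 = storage_ratio(*BASELINE, 512, "int8")      # 12288 / 512 bytes
ratio_binary_512 = storage_ratio(*BASELINE, 512, "binary")  # 12288 / 64 bytes

# Scale a $10K storage bill by the binary ratio.
scaled_cost = 10_000 / ratio_binary_512

print(f"int8 512-dim:   {ratio_int8_512:.0f}x smaller")
print(f"binary 512-dim: {ratio_binary_512:.0f}x smaller, $10,000 -> ${scaled_cost:.2f}")
```

Under these assumed settings the ratios come out to 24x and 192x, and a $10K bill scales to roughly $52, which is consistent with the figures in the announcement.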
While the one-year-old voyage-code-2 is already unparalleled in code retrieval, voyage-code-3 pushes the boundaries even further.
Voyage created a total of 238 new high-quality reasoning-intensive code retrieval datasets that address the shortcomings of existing benchmarks (noisy labels, overly simplistic tasks, and data contamination). voyage-code-3 outperforms all other models in every group of datasets.