Guillem Ramírez
@Guillemram
Followers: 28 · Following: 8 · Media: 3 · Statuses: 12
🚨 Before Sam puts personalized ads in your AI chats… Take our 5-minute survey & discover what LLMs actually know about you! 🤖💡 Your responses will help build better AI privacy safeguards.
1 reply · 1 repost · 4 likes
The next EuroLLM model is out 🎉 We support all the 🇪🇺 EU languages (+ more), now in a 9B size (base and instruct). We are not done yet; stay tuned for more 👀
Today we release EuroLLM-9B: the best EU-made multilingual LLM of its size! Check the blog post for more info and results: https://t.co/jjuSqXzpFk. Stay tuned for the technical report and bigger and more powerful models!
0 replies · 6 reposts · 22 likes
These papers appeared on arXiv the same day and present nearly the same method, wild! Cache & Distil: Optimising API Calls to LLMs https://t.co/8RbuMgTjlL Cache me if you Can: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to LLMs https://t.co/NqWo3jKV3I
10 replies · 42 reposts · 168 likes
We release the code to encourage more work on optimising LLM API calls https://t.co/oYMddc8EgS
Link preview: github.com · guillemram97/neural-caching
0 replies · 0 reposts · 1 like
This problem, deciding whether to call the LLM, shares similarities with Active Learning. We benchmark and analyse classic Active Learning criteria, finding that some of them are useful for saving LLM calls.
1 reply · 0 reposts · 1 like
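For instance, an uncertainty criterion such as predictive entropy can gate the LLM call. A minimal sketch, assuming the student exposes log-probabilities over its candidate answers (the function name and threshold value are illustrative assumptions, not the paper's exact setup):

```python
import math

def should_call_llm(student_logprobs, threshold=1.0):
    """Uncertainty-based escalation: call the LLM when the student's
    predictive entropy is high, a classic Active Learning criterion."""
    probs = [math.exp(lp) for lp in student_logprobs]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy > threshold

# A confident student (low entropy) keeps the query local.
print(should_call_llm([math.log(0.9), math.log(0.05), math.log(0.05)]))  # False
print(should_call_llm([math.log(0.4), math.log(0.3), math.log(0.3)]))    # True
```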
In this work, we introduce the 'neural caching' problem: given a user request, a student model first produces an answer; a policy then decides, under a budget constraint, whether to call the LLM to correct that annotation.
1 reply · 0 reposts · 1 like
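A rough sketch of that loop in Python (the `student`, `llm`, and `policy` objects and their methods are illustrative placeholders, not the interfaces from the released repo):

```python
def neural_caching_loop(requests, student, llm, policy, budget):
    """Serve a stream of user requests under an LLM-call budget.

    The student answers every query; the policy decides, per query,
    whether one unit of budget is worth spending on an LLM call whose
    answer both replaces the student's output and is used to further
    train (distil into) the student.
    """
    spent = 0
    for query in requests:
        answer = student.predict(query)
        if spent < budget and policy.should_call(query, answer, spent, budget):
            answer = llm.annotate(query)    # teacher annotation, treated as gold
            student.update(query, answer)   # online distillation step
            spent += 1
        yield query, answer
```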
Using GPT-4, but the calls are expensive? Distilling your past queries into a student model may help. Introducing 'Cache & Distil: Optimising API Calls to Large Language Models'
Link preview: arxiv.org · Large-scale deployment of generative AI tools often depends on costly API calls to a Large Language Model (LLM) to fulfil user queries. To curtail the frequency of these calls, one can employ a...
1 reply · 19 reposts · 38 likes
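As a deliberately simplified illustration of the distillation idea: cached (query, LLM answer) pairs become supervised training examples for the student. Everything below is a hypothetical sketch, not code from the paper:

```python
# Hypothetical cache of past GPT-4 calls: (user query, LLM answer).
cache = [
    ("Translate 'cat' into French.", "chat"),
    ("What is the capital of Portugal?", "Lisbon"),
]

def to_training_example(query, answer):
    # Standard instruction-tuning format: the student is trained to
    # reproduce the teacher's answer given the original query.
    return {"prompt": query, "completion": answer}

# Past queries become a distillation dataset for the student model.
train_set = [to_training_example(q, a) for q, a in cache]
print(train_set[0])
```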