Guillem Ramírez Profile
Guillem Ramírez

@Guillemram

Followers 28 · Following 8 · Media 3 · Statuses 12

PhD student in NLP, UoE

Edinburgh
Joined August 2023
@Guillemram
Guillem Ramírez
1 month
🚨 Before Sam puts personalized ads in your AI chats… Take our 5 min survey & discover what LLMs actually know about you! 🤖💡 Your responses will help build better AI privacy safeguards.
1 reply · 1 retweet · 4 likes
@m_klimasz
Mateusz Klimaszewski
1 year
The next EuroLLM model is out 🎉 We support all the 🇪🇺 EU languages (+ more), but now in a 9B size (base and instruct). We are not done yet; stay tuned for more 👀
@PedroHenMartins
Pedro Martins
1 year
Today we release EuroLLM-9B: the best EU-made multilingual LLM of its size! Check the blog post for more info and results: https://t.co/jjuSqXzpFk. Stay tuned for the technical report and bigger and more powerful models!
0 replies · 6 retweets · 22 likes
@DimitrisPapail
Dimitris Papailiopoulos
2 years
These papers appeared the same day on arxiv and present nearly the same method, wild! Cache & Distil: Optimising API Calls to LLMs https://t.co/8RbuMgTjlL Cache me if you Can: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to LLMs https://t.co/NqWo3jKV3I
10 replies · 42 retweets · 168 likes
@Guillemram
Guillem Ramírez
2 years
We release the code to encourage more work on optimising LLM API calls https://t.co/oYMddc8EgS
Card: github.com — Contribute to guillemram97/neural-caching development by creating an account on GitHub.
0 replies · 0 retweets · 1 like
@Guillemram
Guillem Ramírez
2 years
This problem, deciding whether to call the LLM, shares some similarities with Active Learning. We benchmark and analyse classic Active Learning criteria, finding that some of them can be useful to save LLM calls.
1 reply · 0 retweets · 1 like
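One classic Active Learning criterion of the kind benchmarked here is margin sampling: call the teacher only when the student's two top labels are close in probability. A minimal sketch (the function names and the threshold are illustrative, not taken from the paper):

```python
def margin(probs):
    """Margin score: gap between the two most probable labels.
    A small margin means the student is uncertain, so a teacher
    (LLM) call is more likely to be worth its cost."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] - top2[1]

def should_call_llm(probs, threshold=0.2):
    # Spend an LLM call only when the student's margin is small.
    return margin(probs) < threshold

print(should_call_llm([0.55, 0.40, 0.05]))  # uncertain student -> True
print(should_call_llm([0.90, 0.08, 0.02]))  # confident student -> False
```

Other criteria from the Active Learning literature (entropy, query-by-committee) slot into the same `should_call_llm` decision point.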
@Guillemram
Guillem Ramírez
2 years
In this work, we introduce the 'neural caching' problem: given a user request, a student model produces the answer; a policy then decides, under a budget constraint, whether to call the LLM to correct that annotation.
1 reply · 0 retweets · 1 like
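The neural-caching loop described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the callable interfaces, the confidence threshold, and the unit call cost are all assumptions.

```python
def neural_caching_step(query, student, llm, budget, cost_per_call=1.0):
    """One step of a (simplified) neural-caching loop: the student answers
    every query; a toy policy decides whether to also spend budget on the
    LLM to obtain a corrected label.
    `student` returns (answer, confidence); `llm` returns an answer."""
    answer, confidence = student(query)
    call_llm = budget >= cost_per_call and confidence < 0.8  # toy policy
    if call_llm:
        answer = llm(query)      # teacher label, treated as correct
        budget -= cost_per_call  # spend part of the call budget
        # in the full setup, (query, answer) would be added to the
        # student's training data for later distillation
    return answer, budget

# Toy usage with stub models:
student = lambda q: ("maybe", 0.5)
llm = lambda q: "yes"
ans, remaining = neural_caching_step("is caching useful?", student, llm, budget=3)
print(ans, remaining)  # -> yes 2.0
```

Once the budget is exhausted, every remaining query is served by the student alone, which is what makes the call policy matter.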
@Guillemram
Guillem Ramírez
2 years
Using GPT-4 but the calls are expensive? Distilling your past queries into a student model may help. Introducing: 'Cache & Distil: Optimising API Calls to Large Language Models'
Card: arxiv.org — Large-scale deployment of generative AI tools often depends on costly API calls to a Large Language Model (LLM) to fulfil user queries. To curtail the frequency of these calls, one can employ a...
1 reply · 19 retweets · 38 likes