
Sabri Eyuboglu
@EyubogluSabri
Followers: 1K
Following: 1K
Media: 15
Statuses: 446
Working on language model memory. CS PhD student @Stanford working with @HazyResearch and @james_y_zou. 🪬
Joined February 2019
When we put lots of text (e.g., a code repo) into LLM context, cost soars b/c of the KV cache’s size. What if we trained a smaller KV cache for our documents offline? Using a test-time training recipe we call self-study, we find that this can reduce cache memory by 39x on average.
13
73
303
RT @krandiash: Instead of using ChatGPT, I’m increasingly using Claude Code for non code queries as well, including long form writing, anal….
0
3
0
RT @_khaledsaab: After two amazing years @GoogleDeepMind, I’m now joining @OpenAI to accelerate biomedical intelligence with @thekaransingh….
0
35
0
RT @cartesia_ai: Introducing Line by Cartesia: the modern voice agent development platform. Line was built to be code-first, because best-i….
0
53
0
RT @DimitrisPapail: Thinking about model generalization is quite painful. We observe empirically that models trained with SGD on cross-en….
0
56
0
RT @oshaikh13: If you thought referencing past chats was cool, we built an MCP that lets Claude use *anything you see or do on your compute….
0
33
0
RT @krandiash: Excited to see this release from @ShreyaR, they’ve really created a magical experience for builders with Snowglobe. Evals an….
0
1
0
RT @ShreyaR: Introducing ❄️ @snowglobe_so, the simulation engine for AI chatbots. Magically simulate the behavior of your users to test an….
0
87
0
RT @kalomaze: @teortaxesTex i keep thinking who will stop fucking around and productionize ICL context distillation (i.e Cartridges paper)….
0
2
0
RT @EricTopol: Who do you call when you need to design novel, potent nanobodies vs a pathogen? The virtual lab of A.I. agents @Nature @jame….
0
50
0
RT @james_y_zou: ⚡️Thrilled that #VirtualLab is published in @Nature! We created a team of AI agents to mirror my….
0
250
0
RT @jordanjuravsky: Check out Tokasaurus on Modal to make Llama-1B brrr! This repeated sampling example shows off two engine features that….
0
8
0
RT @charles_irl: Tokasaurus, the "little LLM engine that could" by @jordanjuravsky and @EyubogluSabri of @HazyResearch/@ScalingIntelLab, is….
0
9
0
RT @ryansehrlich: Thank you for the kind words -- we can't either! We're really excited about models learning new things and remembering t….
0
7
0
RT @bio_bootloader: Cartridges could be this "missing learning paradigm" Karpathy talks about. 1) agent does tasks, collects memories that….
0
5
0
RT @ESFoMo: Looking forward to seeing everyone for ES-FoMo part three tomorrow! We'll be in East Exhibition Hall A (the big one), and we've….
0
22
0
RT @realDanFu: ES-FoMo is back tomorrow! Come join us in East Exhibition Hall A bright and early at 8:30AM for a great slate of invited tal….
0
2
0