Barlas Oğuz
@barlas_berkeley
Followers: 46 · Following: 15 · Media: 0 · Statuses: 9
Research scientist, Meta FAIR. ex-MSFT, Berkeley alum
Oakland, CA, USA
Joined March 2014
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full…
53 replies · 300 reposts · 2K likes
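A minimal sketch of the sparse-finetuning idea as the tweet describes it: only the memory slots a query actually retrieves receive gradient, so an update for a new fact leaves the rest of the table (existing knowledge) untouched. The table sizes, lookup, and loss below are illustrative assumptions, not the paper's code.

```python
import torch

# Toy memory table addressed by top-k key lookup (sizes are made up).
num_slots, dim, k = 10_000, 64, 4
keys = torch.randn(num_slots, dim)                   # frozen addressing keys
values = torch.randn(num_slots, dim, requires_grad=True)

def memory_forward(query):
    scores = query @ keys.T                          # similarity to every slot
    top = scores.topk(k)
    w = top.values.softmax(-1)                       # weights over the k hits
    return w @ values[top.indices], top.indices

query, target = torch.randn(dim), torch.randn(dim)   # stand-in for a new fact
out, touched = memory_forward(query)
(out - target).pow(2).mean().backward()

# Gradient is nonzero only on the k retrieved rows: a targeted, sparse
# update that cannot interfere with the other 9,996 slots.
assert (values.grad.abs().sum(dim=1) != 0).sum() == k
```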
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 https://t.co/4nUsbC1YKF
7 replies · 16 reposts · 31 likes
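Mechanically, "organizing speech tokens into latent patches" could look like the pooling below: fixed windows of speech-token embeddings are compressed into single latent vectors, so the shared trunk sees a shorter, more text-like sequence. The patch size, linear pooling, and all names are my assumptions, not the LST architecture.

```python
import torch
import torch.nn as nn

class LatentPatcher(nn.Module):
    """Illustrative only: fold P consecutive speech-token embeddings
    into one latent patch before a shared speech/text transformer."""
    def __init__(self, dim: int, patch_size: int = 4):
        super().__init__()
        self.patch_size = patch_size
        self.proj = nn.Linear(dim * patch_size, dim)

    def forward(self, speech_emb: torch.Tensor) -> torch.Tensor:
        B, T, D = speech_emb.shape          # assumes T divisible by patch_size
        P = self.patch_size
        return self.proj(speech_emb.reshape(B, T // P, P * D))

emb = torch.randn(2, 32, 256)                # 32 speech tokens per example
print(LatentPatcher(256)(emb).shape)         # torch.Size([2, 8, 256])
```

A 4x shorter speech sequence sits much closer to the length of the matching text, which is one plausible reading of why text→speech transfer improves.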
🔍 How do we teach an LLM to 𝘮𝘢𝘴𝘵𝘦𝘳 a body of knowledge? In new work with @AIatMeta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results: * 𝟔𝟔% on SimpleQA w/ an 8B model by studying the Wikipedia…
15 replies · 156 reposts · 1K likes
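A sketch of the self-study loop as the tweet frames it: the model writes its own study material (paraphrases, quizzes, summaries) from each source document, then finetunes on what it wrote rather than on the raw text. The prompts and the `generate`/`finetune` callables are placeholders, not the paper's recipe.

```python
from typing import Callable, Iterable

STUDY_PROMPTS = [
    "Rewrite the passage below in your own words:\n{doc}",
    "Write five question-answer pairs this passage supports:\n{doc}",
    "List the key facts in this passage as bullet points:\n{doc}",
]

def active_reading(generate: Callable[[str], str],
                   finetune: Callable[[list[str]], None],
                   corpus: Iterable[str]) -> None:
    """One round of self-study on a corpus (assumed control flow)."""
    notes = [generate(t.format(doc=doc))
             for doc in corpus for t in STUDY_PROMPTS]
    finetune(notes)          # train on self-generated study notes

# Toy usage with stand-ins for a real LM's sampling / training calls:
active_reading(lambda p: f"[model notes for: {p[:24]}…]",
               lambda notes: print(f"finetuning on {len(notes)} examples"),
               ["Paris is the capital of France."])
```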
@AIatMeta And understanding how to teach models new things is increasingly important – not just for training capable specialized models (e.g. AR as a practical technique for training personalized/expert models), but looking towards a continual learning paradigm where models keep acquiring new…
2 replies · 3 reposts · 31 likes
...is today a good day for new paper posts? 🤖Learning to Reason for Factuality 🤖 📝: https://t.co/ss09xKGcAm
- New reward func for GRPO training of long CoTs for *factuality*
- Design stops reward hacking by favoring precision, detail AND quality
- Improves base model across…
1 reply · 50 reposts · 384 likes
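The second bullet names the core design constraint: a factuality reward that can't be gamed by short, hedged answers, because precision, detail, and response quality are scored jointly. A toy composite under that assumption (the claim extraction and verification behind these inputs are placeholders):

```python
def factuality_reward(claims_supported: int, claims_total: int,
                      quality: float) -> float:
    """Toy composite reward (assumed form, not the paper's function)."""
    if claims_total == 0:
        return 0.0                      # refusing to claim anything earns nothing
    precision = claims_supported / claims_total  # hallucinating hurts
    detail = claims_supported ** 0.5             # saying less isn't a free win
    return precision * detail * quality          # degenerate text is penalized

# A short, safe answer can't beat a detailed, mostly-accurate one:
print(factuality_reward(2, 2, 0.9))   # ≈ 1.27
print(factuality_reward(8, 9, 0.9))   # ≈ 2.26
```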
🚀Introducing Hierarchical Reasoning Model🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock next AI breakthrough with…
227 replies · 657 reposts · 4K likes
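A rough sketch of the hierarchical recurrence the HRM announcement gestures at: a slow high-level module updates once per outer step, while a fast low-level module iterates several times in between, steered by the current high-level state. The module choice (GRU cells), sizes, and schedule are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HRMSketch(nn.Module):
    """Two coupled recurrent modules at different timescales (illustrative)."""
    def __init__(self, dim: int = 128, inner_steps: int = 4):
        super().__init__()
        self.inner_steps = inner_steps
        self.high = nn.GRUCell(dim, dim)    # slow, abstract planner
        self.low = nn.GRUCell(dim, dim)     # fast, detailed worker

    def forward(self, x: torch.Tensor, outer_steps: int = 3) -> torch.Tensor:
        h = torch.zeros_like(x)             # high-level state
        l = torch.zeros_like(x)             # low-level state
        for _ in range(outer_steps):
            for _ in range(self.inner_steps):
                l = self.low(x + h, l)      # fast loop, steered by the plan
            h = self.high(l, h)             # slow update from the worker's result
        return h

print(HRMSketch()(torch.randn(2, 128)).shape)   # torch.Size([2, 128])
```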
Last one of the year - EWE: https://t.co/D5y53ahtyX Ewe (Explicit Working Memory) enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
2 replies · 22 reposts · 102 likes
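The one-liner implies a particular control flow: generate a chunk, let external resources (retrieval, verification) refresh a working memory, and condition the next chunk on it. A schematic loop under those assumptions, with every helper a placeholder callable:

```python
from typing import Callable

def generate_with_working_memory(
    draft_next: Callable[[str, list[str]], str],  # LM step conditioned on memory
    check_facts: Callable[[str], list[str]],      # external feedback, e.g. retrieval
    prompt: str,
    max_chunks: int = 5,
) -> str:
    """Ewe-style generation sketch (assumed control flow, not the paper's code)."""
    memory: list[str] = []
    text = prompt
    for _ in range(max_chunks):
        chunk = draft_next(text, memory)     # write the next passage
        memory += check_facts(chunk)         # real-time feedback refreshes memory
        text += chunk                        # the next draft sees updated memory
    return text

# Toy run with stub callables:
print(generate_with_working_memory(
    lambda text, mem: f" [chunk, {len(mem)} memory entries]",
    lambda chunk: ["(retrieved evidence)"],
    "Oakland, CA is"))
```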
New research from Meta FAIR: Memory Layers at Scale. This work takes memory layers beyond proof-of-concept, proving their utility at contemporary scale ➡️ https://t.co/0E952C2fJB
38 replies · 178 reposts · 1K likes
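For context, here is a minimal sketch of the mechanism this line of work scales up: a large trainable key/value table queried by sparse top-k lookup in place of a dense feed-forward block. The product-key factorization and parallelization tricks that make it work at billions of slots are omitted, and the sizes below are illustrative.

```python
import torch
import torch.nn as nn

class MemoryLayerSketch(nn.Module):
    """Sparse key-value memory standing in for an FFN (toy sizes)."""
    def __init__(self, dim: int = 64, num_slots: int = 65_536, k: int = 8):
        super().__init__()
        self.k = k
        self.keys = nn.Parameter(torch.randn(num_slots, dim) / dim ** 0.5)
        self.values = nn.Parameter(torch.randn(num_slots, dim) / dim ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = x @ self.keys.T                    # (B, T, num_slots)
        top = scores.topk(self.k, dim=-1)
        w = top.values.softmax(dim=-1)              # weights over k active slots
        v = self.values[top.indices]                # (B, T, k, dim) gathered values
        return torch.einsum("btk,btkd->btd", w, v)  # mix the k hits per token

x = torch.randn(2, 16, 64)
print(MemoryLayerSketch()(x).shape)                 # torch.Size([2, 16, 64])
```

Only k of 65,536 slots are touched per token, which is why parameter capacity can grow enormously while per-token compute stays nearly flat.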