Barlas Oğuz

@barlas_berkeley

Followers: 46 · Following: 15 · Media: 0 · Statuses: 9

Research scientist, Meta FAIR. Ex-MSFT, Berkeley alumnus

Oakland, CA, USA
Joined March 2014
@realJessyLin
Jessy Lin
17 days
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full …
53 · 300 · 2K
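A rough PyTorch sketch of the mechanism the tweet describes: keep the backbone frozen and update only the memory slots that the new examples actually address. The simple top-k key-value `MemoryLayer` below and its gradient masking are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    """Toy key-value memory: each query selects its top-k slots and
    returns a softmax-weighted sum of the selected value embeddings."""
    def __init__(self, d_model: int, n_slots: int = 4096, k: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.values = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.k = k

    def forward(self, x):                        # x: (batch, d_model)
        scores = x @ self.keys.T                 # (batch, n_slots)
        top_scores, top_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(top_scores, dim=-1)  # (batch, k)
        picked = self.values[top_idx]            # (batch, k, d_model)
        return (weights.unsqueeze(-1) * picked).sum(dim=1), top_idx

mem = MemoryLayer(d_model=64)
for p in mem.parameters():                       # freeze everything...
    p.requires_grad_(False)
mem.values.requires_grad_(True)                  # ...except the value slots

opt = torch.optim.SGD([mem.values], lr=1e-2)
x = torch.randn(8, 64)                           # stand-in hidden states for new facts
target = torch.randn(8, 64)

out, top_idx = mem(x)
loss = F.mse_loss(out, target)
loss.backward()

# Zero the gradient of every slot the batch did not touch, so the update
# is targeted and knowledge stored in other slots is left alone.
touched = torch.zeros(mem.values.shape[0], dtype=torch.bool)
touched[top_idx.unique()] = True
mem.values.grad[~touched] = 0.0
opt.step()
```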
@Yen_Ju_Lu
Yen-Ju Lu
30 days
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 https://t.co/4nUsbC1YKF
7 · 16 · 31
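To make "latent patches" concrete, here is a minimal sketch, assuming simple fixed-size pooling of consecutive speech-token embeddings into single latent vectors; the paper's actual patching scheme may differ. The point is that the transformer then operates on a shorter, more text-like sequence.

```python
import torch
import torch.nn as nn

class LatentPatcher(nn.Module):
    """Toy patching: compress every patch_size consecutive speech-token
    embeddings into one latent vector via a linear projection."""
    def __init__(self, d_model: int, patch_size: int = 4):
        super().__init__()
        self.patch_size = patch_size
        self.proj = nn.Linear(d_model * patch_size, d_model)

    def forward(self, speech_emb):               # (batch, seq, d_model)
        b, t, d = speech_emb.shape
        t = t - t % self.patch_size              # drop the ragged tail for simplicity
        patches = speech_emb[:, :t].reshape(
            b, t // self.patch_size, self.patch_size * d)
        return self.proj(patches)                # (batch, seq/patch_size, d_model)

patcher = LatentPatcher(d_model=64, patch_size=4)
latents = patcher(torch.randn(2, 100, 64))       # (2, 100, 64) -> (2, 25, 64)
```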
@realJessyLin
Jessy Lin
2 months
🔍 How do we teach an LLM to *master* a body of knowledge? In new work with @AIatMeta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results: * 66% on SimpleQA w/ an 8B model by studying the Wikipedia …
15 · 156 · 1K
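The self-study recipe reads like a data-generation loop: prompt the model to restate a document in several forms, then finetune on what it produced. Everything below is hypothetical scaffolding; `llm_generate` stands in for any completion API, and the strategy prompts only gesture at the kind of variation the tweet describes.

```python
def llm_generate(prompt: str) -> str:
    """Hypothetical stand-in: call your LLM API of choice here."""
    raise NotImplementedError("plug in a model")

# Illustrative self-study strategies, not the paper's exact set.
STUDY_STRATEGIES = [
    "Paraphrase the passage below in your own words:\n{doc}",
    "Write question-answer pairs covering every fact in the passage:\n{doc}",
    "Explain the passage below to a beginner, with concrete examples:\n{doc}",
]

def active_reading_corpus(docs: list[str]) -> list[str]:
    """Generate varied restatements of each source document; the result
    becomes finetuning data, so the model 'studies' its own corpus."""
    corpus = []
    for doc in docs:
        for template in STUDY_STRATEGIES:
            corpus.append(llm_generate(template.format(doc=doc)))
    return corpus
```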
@realJessyLin
Jessy Lin
2 months
@AIatMeta And understanding how to teach models new things is increasingly important – not just for training capable specialized models (e.g. Active Reading as a practical technique for training personalized/expert models), but looking towards a continual learning paradigm where models keep acquiring new …
2 · 3 · 31
@jaseweston
Jason Weston
3 months
...is today a good day for new paper posts? 🤖 Learning to Reason for Factuality 🤖 📝: https://t.co/ss09xKGcAm
- New reward function for GRPO training of long CoTs for *factuality*
- Design stops reward hacking by favoring precision, detail AND quality
- Improves base model across …
1 · 50 · 384
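A toy rendering of the reward shape those bullets describe, assuming precision, detail, and quality combine multiplicatively; the paper's exact reward function differs.

```python
import math

def factuality_reward(n_correct: int, n_claims: int, quality: float) -> float:
    """Illustrative only: precision alone is hackable by emitting a few
    safe claims, so a sublinear detail term and a quality score enter too."""
    if n_claims == 0:
        return 0.0                       # saying nothing earns nothing
    precision = n_correct / n_claims
    detail = math.log1p(n_claims)        # more claims help, with diminishing returns
    return precision * detail * quality

# A terse-but-safe answer no longer dominates a detailed, mostly-correct one:
print(factuality_reward(2, 2, 0.9))      # 2 claims, all correct -> ~0.99
print(factuality_reward(9, 10, 0.9))     # 10 claims, 9 correct  -> ~1.94
```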
@makingAGI
Guan Wang
4 months
🚀 Introducing the Hierarchical Reasoning Model 🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, with no pretraining or CoT! Unlock the next AI breakthrough with …
227 · 657 · 4K
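The hierarchical mechanism can be caricatured as two coupled recurrent modules on different timescales: a fast low-level module takes several inner steps under guidance from a slow high-level module that updates once per outer step. A minimal sketch with GRU cells as stand-ins for the actual modules:

```python
import torch
import torch.nn as nn

class ToyHRM(nn.Module):
    """Two coupled recurrent states: the low-level state iterates
    inner_steps times per single high-level update."""
    def __init__(self, d: int = 64, inner_steps: int = 4, outer_steps: int = 3):
        super().__init__()
        self.low = nn.GRUCell(d, d)
        self.high = nn.GRUCell(d, d)
        self.inner_steps, self.outer_steps = inner_steps, outer_steps

    def forward(self, x):                        # x: (batch, d)
        z_low = torch.zeros_like(x)
        z_high = torch.zeros_like(x)
        for _ in range(self.outer_steps):
            for _ in range(self.inner_steps):    # fast module refines...
                z_low = self.low(x + z_high, z_low)
            z_high = self.high(z_low, z_high)    # ...slow module consolidates
        return z_high

out = ToyHRM()(torch.randn(2, 64))               # (2, 64)
```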
@gargighosh
Gargi Ghosh
10 months
Last one of the year - EWE: https://t.co/D5y53ahtyX EWE (Explicit Working Memory) enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
2 · 22 · 102
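The control flow EWE describes (generate, verify against external resources, refresh the working memory, regenerate) can be sketched as below; `generate_chunk`, `fact_check`, and `retrieve` are hypothetical stubs for the paper's actual components.

```python
from dataclasses import dataclass, field

@dataclass
class Feedback:
    unsupported_claims: list = field(default_factory=list)
    done: bool = False

# Hypothetical stand-ins: wire these to a real LLM, verifier, and retriever.
def generate_chunk(prompt: str, so_far: str, memory: list) -> str:
    raise NotImplementedError

def fact_check(chunk: str) -> Feedback:
    raise NotImplementedError

def retrieve(claims: list) -> list:
    raise NotImplementedError

def generate_with_working_memory(prompt: str, max_chunks: int = 10) -> str:
    memory: list = []                    # explicit working memory holding evidence
    text = ""
    for _ in range(max_chunks):
        chunk = generate_chunk(prompt, text, memory)
        feedback = fact_check(chunk)     # real-time feedback from external resources
        if feedback.unsupported_claims:  # refresh memory, then redo the chunk
            memory.extend(retrieve(feedback.unsupported_claims))
            chunk = generate_chunk(prompt, text, memory)
        text += chunk
        if feedback.done:
            break
    return text
```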
@AIatMeta
AI at Meta
10 months
New research from Meta FAIR — Memory Layers at Scale. This work takes memory layers beyond proof-of-concept, proving their utility at contemporary scale ➡️ https://t.co/0E952C2fJB
38 · 178 · 1K
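"At scale" is the hard part: addressing millions of memory slots per token is only affordable with tricks like product keys, where two small key tables of size sqrt(N) stand in for N full keys. A sketch of that lookup under those assumptions, not the paper's implementation:

```python
import torch

def product_key_topk(query, keys_a, keys_b, k: int = 4):
    """Score two sqrt(N)-sized key tables and combine the best candidates,
    instead of scoring all N = sqrt(N)^2 slots directly."""
    qa, qb = query.chunk(2, dim=-1)              # split query into two halves
    top_sa, ia = (qa @ keys_a.T).topk(k, dim=-1) # best sub-keys, first half
    top_sb, ib = (qb @ keys_b.T).topk(k, dim=-1) # best sub-keys, second half
    # combined scores for the k*k candidate (i, j) pairs
    scores = top_sa.unsqueeze(-1) + top_sb.unsqueeze(-2)   # (batch, k, k)
    flat = scores.flatten(1).topk(k, dim=-1)
    ii = torch.gather(ia, 1, flat.indices // k)  # recover the pair indices
    jj = torch.gather(ib, 1, flat.indices % k)
    sqrt_n = keys_a.shape[0]
    return flat.values, ii * sqrt_n + jj         # flat slot ids in [0, N)

q = torch.randn(2, 64)
ka, kb = torch.randn(128, 32), torch.randn(128, 32)  # 128^2 = 16,384 slots
scores, slot_ids = product_key_topk(q, ka, kb)
```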