dacapo
@dacapo_go
Followers: 53
Following: 888
Media: 93
Statuses: 994
Learning how machines learn / There's no free lunch
San Francisco, CA
Joined September 2023
Are we going to have a new ARC-AGI bench every year?
Everyone says LLMs can't do true reasoning—they just pattern-match and hallucinate code. So why did our system just solve abstract reasoning puzzles that are specifically designed to be unsolvable by pattern matching? Let me show you what happens when you stop asking AI for
0
0
0
The amount of information in a single 1h video by @mrdbourke is crazy high https://t.co/mpPlMh8V5Y
0
0
0
We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents: 1. The best 32B base model. 2. The best 7B Western thinking & instruct models. 3. The first 32B (or larger) fully open reasoning model. This is a big
92
361
2K
Had a debugging session with Gemini 3 Pro. For 10+ minutes I let it try to fix my brew update problem. I got sick of the back and forth and simply found the root cause myself with 1 Google query + 1 command.
0
0
0
Another chart to be placed on the wall (src: PyTorch TorchTitan paper)
2
11
143
Tomorrow is a new day, it's just part of the process. We'll eventually win :)
0
0
0
A few months ago now, I wrote a document about my experiences interviewing for AI research jobs before eventually joining @OpenAI. This doc details my process and lessons learned. Hope it's helpful! https://t.co/jH1mK4Nii5
11
53
905
> There’s no free lunch.
> When you reduce the complexity of attention, you pay a price.
> The question is, where?
This is *exactly* how I typically end my Transformer tutorial. The slide is already 4 years old and I've never updated it, but it still holds:
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? On behalf of pre-training lead Haohai Sun. ( https://t.co/WH4xOD9KrT) I. Introduction As the lead of MiniMax-M2 pretrain, I've been getting many queries from the community on "Why did you turn back the clock
35
61
906
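The thread above is about the cost of full softmax attention versus cheaper variants. As a minimal sketch of that trade-off (my own illustration, not from the tweet or the MiniMax blog): full attention materializes an n×n score matrix, while a linear-attention feature map avoids it at the cost of the pairwise softmax normalization. All function names and shapes below are illustrative.

```python
# Illustrative sketch only: full softmax attention vs. a simple
# linear-attention variant, to show where the O(n^2) cost goes and what
# you give up when you remove it.
import torch

def full_attention(q, k, v):
    # Scores form an (n x n) matrix: memory and compute grow
    # quadratically with sequence length n.
    scale = q.shape[-1] ** -0.5
    scores = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return scores @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernel feature map (elu + 1, as in the "Transformers are RNNs" line
    # of work): associativity lets us compute phi(q) @ (phi(k)^T v) in
    # O(n * d^2), but the pairwise softmax normalization is gone — that is
    # the price the quoted slide is pointing at.
    phi = lambda x: torch.nn.functional.elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                              # (d x d), independent of n
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps
    return (q @ kv) / z

n, d = 2048, 64
q, k, v = (torch.randn(1, n, d) for _ in range(3))
print(full_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```

On long sequences the linear variant's cost grows roughly linearly in n, which is the kind of reduction the quoted blog weighs against full attention's quality.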
feels bad for the cracked ones that have been rejected without any consideration :(
0
0
0
My team is currently trying to hire ML Engineers, and ... oh boy, seeing HR work from the inside is even worse than I thought
1
0
0
At this point someone needs to make the Good Will Hunting meme with all the Karpathy and Sutton takes
0
0
0
You won't believe what you'll achieve in the next 20 years
0
0
1
Feels like bait, but I always thought these were "golden prompts" for LLMs
Wait, do people actually prompt LLMs by starting with things like: "You are an expert programmer ..." or "NEVER EVER do something" with the hope that the models will follow those statements more obediently? (and does it work?) 😅
0
0
0
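For context, this is the kind of prompt the quoted tweet is asking about, written against the OpenAI Python SDK chat format (my assumption for illustration; neither tweet names a specific API, the model name is a placeholder, and whether models actually obey such phrasing more closely is exactly the open question):

```python
# Illustrative only: a role + hard-rule style "golden prompt", sent through
# the OpenAI Python SDK chat format. Model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are an expert Python programmer. "
    "NEVER return code without type hints. "
    "ALWAYS explain your reasoning before the code."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Write a function that deduplicates a list."},
    ],
)
print(response.choices[0].message.content)
```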
"They built a voice search model that doesn’t understand words it understands intent." I don't think that's "unthinkable", plenty of papers on the subject since 2018 and probably earlier too
Google just did the unthinkable. They built a voice search model that doesn’t understand words, it understands intent. It’s called Speech-to-Retrieval (S2R), and it might mark the death of speech-to-text forever. Here’s how it works (and why it matters way more than it sounds)
1
2
5
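The general idea behind a system like the one these tweets describe is a dual-encoder retriever: embed the spoken query and the candidate documents into the same vector space and rank by similarity, skipping the transcription step. The sketch below shows only that generic pattern, not Google's actual S2R model; every module, dimension, and name is illustrative.

```python
# Toy dual-encoder sketch of "retrieve from audio without transcribing":
# an audio encoder and a document encoder map into a shared embedding
# space, and retrieval is nearest-neighbor by cosine similarity.
# Not Google's S2R; all components here are illustrative.
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    def __init__(self, n_mels=80, dim=256):
        super().__init__()
        self.net = nn.GRU(n_mels, dim, batch_first=True)
    def forward(self, mel):                      # (batch, frames, n_mels)
        _, h = self.net(mel)
        return nn.functional.normalize(h[-1], dim=-1)   # (batch, dim)

class DocEncoder(nn.Module):
    def __init__(self, vocab=10_000, dim=256):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)
    def forward(self, token_ids):                # (batch, tokens)
        return nn.functional.normalize(self.emb(token_ids), dim=-1)

audio_enc, doc_enc = AudioEncoder(), DocEncoder()
query = audio_enc(torch.randn(1, 300, 80))            # one spoken query
docs = doc_enc(torch.randint(0, 10_000, (5, 12)))     # five candidate docs
scores = query @ docs.T                               # cosine similarity
print("best match:", scores.argmax(dim=-1).item())
```

In a trained system the two encoders would be learned jointly (e.g. with a contrastive loss) so that audio of a query lands near the documents it should retrieve, which is what "understands intent rather than words" is gesturing at.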