dacapo

@dacapo_go

Followers
53
Following
888
Media
93
Statuses
994

Learning how machines learn / There's no free lunch

San Francisco, CA
Joined September 2023
@dacapo_go
dacapo
1 day
Are we going to have a new ARC-AGI bench every year?
@IntuitMachine
Carlos E. Perez
3 days
Everyone says LLMs can't do true reasoning—they just pattern-match and hallucinate code. So why did our system just solve abstract reasoning puzzles that are specifically designed to be unsolvable by pattern matching? Let me show you what happens when you stop asking AI for
0
0
0
@dacapo_go
dacapo
2 days
The amount of information in a single 1h video by @mrdbourke is crazy high https://t.co/mpPlMh8V5Y
0
0
0
@natolambert
Nathan Lambert
19 days
We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents: 1. The best 32B base model. 2. The best 7B Western thinking & instruct models. 3. The first 32B (or larger) fully open reasoning model. This is a big
92
361
2K
@dacapo_go
dacapo
18 days
Had a debugging session with Gemini 3 Pro. For 10+ minutes I let it try to fix my brew update problem. I got sick of the back and forth and simply found the root cause myself with 1 Google query + 1 command line.
0
0
0
@TheZachMueller
Zach Mueller
21 days
Another chart to be placed on the wall (src; PyTorch TorchTitan paper)
2
11
143
@dacapo_go
dacapo
21 days
Tomorrow is a new day, it's just part of the process. We'll eventually win :)
0
0
0
@dacapo_go
dacapo
21 days
Rejection email after plenty of rounds hits hard :(
1
0
0
@basvanopheusden
basvanopheusden
25 days
A few months ago now, I wrote a document about my experiences interviewing for AI research jobs before eventually joining @OpenAI. This doc details my process and lessons learned. Hope it's helpful! https://t.co/jH1mK4Nii5
11
53
905
@giffmana
Lucas Beyer (bl16)
1 month
> There’s no free lunch. > When you reduce the complexity of attention, you pay a price. > The question is, where? This is *exactly* how I typically end my Transformer tutorial. This slide is already 4 years old, I've never updated it, but it still holds:
@zpysky1125
Pengyu Zhao
1 month
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? On behalf of pre-training lead Haohai Sun. (https://t.co/WH4xOD9KrT) I. Introduction As the lead of MiniMax-M2 pretrain, I've been getting many queries from the community on "Why did you turn back the clock
35
61
906
@dacapo_go
dacapo
2 months
BTW if you're looking for a job as ML Engineer in LLM / NLP hit me up :)
@dacapo_go
dacapo
2 months
My team is currently trying to hire for ML Engineers, and ... oh boy looking at HR work from the inside is even worse than I thought
1
0
0
@dacapo_go
dacapo
2 months
feels bad for the crack that have been rejected without any consideration :(
0
0
0
@dacapo_go
dacapo
2 months
My team is currently trying to hire for ML Engineers, and ... oh boy looking at HR work from the inside is even worse than I thought
1
0
0
@dacapo_go
dacapo
2 months
At this point someone needs to make this Good Will Hunting meme with all the Karpathy and Sutton takes
0
0
0
@dacapo_go
dacapo
2 months
0
0
0
@dacapo_go
dacapo
2 months
You won't believe what you'll achieve in the next 20 years
@forloopcodes
forloop
2 months
chat did i cook?
0
0
1
@dacapo_go
dacapo
2 months
Feels like bait but I always thought these were "golden prompts" for LLM
@hyhieu226
Hieu Pham
2 months
Wait, do people actually prompt LLMs by starting with things like: "You are an expert programmer ..." or "NEVER EVER do something" with the hope that the models will follow those statements more obediently? (and does it work?) 😅
0
0
0
@dacapo_go
dacapo
2 months
Still, massive work by Google once again
0
0
1
@JustinLin610
Junyang Lin
2 months
some models next week
129
55
2K
@dacapo_go
dacapo
2 months
"They built a voice search model that doesn’t understand words it understands intent." I don't think that's "unthinkable", plenty of papers on the subject since 2018 and probably earlier too
@ChrisLaubAI
Chris Laub
2 months
Google just did the unthinkable. They built a voice search model that doesn’t understand words it understands intent. It’s called Speech-to-Retrieval (S2R), and it might mark the death of speech-to-text forever. Here’s how it works (and why it matters way more than it sounds)
1
2
5
@dacapo_go
dacapo
2 months
JetBrains >>>>>>>> VSCode
0
0
0