dacapo
@dacapo_go
Followers: 53
Following: 888
Media: 93
Statuses: 994
Learning how machines learn / There's no free lunch
San Francisco, CA
Joined September 2023
Are we going to have a new ARC-AGI bench every year?
Everyone says LLMs can't do true reasoning—they just pattern-match and hallucinate code. So why did our system just solve abstract reasoning puzzles that are specifically designed to be unsolvable by pattern matching? Let me show you what happens when you stop asking AI for
0
0
0
The amount of information in a single 1h video by @mrdbourke is crazy high https://t.co/mpPlMh8V5Y
0
0
0
We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents: 1. The best 32B base model. 2. The best 7B Western thinking & instruct models. 3. The first 32B (or larger) fully open reasoning model. This is a big
92
361
2K
Had a debugging session with Gemini 3 Pro. For 10+ minutes I let it try to fix my brew update problem. I got sick of the back and forth and simply found the root cause myself with 1 Google query + 1 command.
0
0
0
Another chart to be placed on the wall (src: PyTorch TorchTitan paper)
2
11
143
Tomorrow is a new day, it's just part of the process. We'll eventually win :)
0
0
0
A few months ago now, I wrote a document about my experiences interviewing for AI research jobs before eventually joining @OpenAI. This doc details my process and lessons learned. Hope it's helpful! https://t.co/jH1mK4Nii5
11
53
905
> There’s no free lunch.
> When you reduce the complexity of attention, you pay a price.
> The question is, where?
This is *exactly* how I typically end my Transformer tutorial. The slide is already 4 years old and I've never updated it, but it still holds:
MiniMax M2 Tech Blog 3: Why Did M2 End Up as a Full Attention Model? On behalf of pre-training lead Haohai Sun. ( https://t.co/WH4xOD9KrT) I. Introduction As the lead of MiniMax-M2 pretrain, I've been getting many queries from the community on "Why did you turn back the clock
35
61
906
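The thread above is about the cost of full softmax attention versus cheaper variants. As a minimal sketch of that trade-off (my own illustration, not from the tweet or the MiniMax blog): full attention materializes an n×n score matrix, while a linear-attention feature map avoids it at the cost of the pairwise softmax normalization. All function names and shapes below are illustrative.

```python
# Illustrative sketch only: full softmax attention vs. a simple
# linear-attention variant, to show where the O(n^2) cost goes and what
# you give up when you remove it.
import torch

def full_attention(q, k, v):
    # Scores form an (n x n) matrix: memory and compute grow
    # quadratically with sequence length n.
    scale = q.shape[-1] ** -0.5
    scores = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return scores @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernel feature map (elu + 1, as in the "Transformers are RNNs" line
    # of work): associativity lets us compute phi(q) @ (phi(k)^T v) in
    # O(n * d^2), but the pairwise softmax normalization is gone — that is
    # the price the quoted slide is pointing at.
    phi = lambda x: torch.nn.functional.elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                              # (d x d), independent of n
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps
    return (q @ kv) / z

n, d = 2048, 64
q, k, v = (torch.randn(1, n, d) for _ in range(3))
print(full_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```

On long sequences the linear variant's cost grows roughly linearly in n, which is the kind of reduction the quoted blog weighs against full attention's quality.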
feels bad for the cracked ones that have been rejected without any consideration :(
0
0
0
My team is currently trying to hire ML Engineers, and ... oh boy, seeing HR work from the inside is even worse than I thought
1
0
0
At this point someone needs to make the Good Will Hunting meme with all the Karpathy and Sutton takes
0
0
0
You won't believe what you'll achieve in the next 20 years
0
0
1
Feels like bait, but I always thought these were "golden prompts" for LLMs
Wait, do people actually prompt LLMs by starting with things like: "You are an expert programmer ..." or "NEVER EVER do something" with the hope that the models will follow those statements more obediently? (and does it work?) 😅
0
0
0
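For context, this is the kind of prompt the quoted tweet is asking about, written against the OpenAI Python SDK chat format (my assumption for illustration; neither tweet names a specific API, the model name is a placeholder, and whether models actually obey such phrasing more closely is exactly the open question):

```python
# Illustrative only: a role + hard-rule style "golden prompt", sent through
# the OpenAI Python SDK chat format. Model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are an expert Python programmer. "
    "NEVER return code without type hints. "
    "ALWAYS explain your reasoning before the code."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Write a function that deduplicates a list."},
    ],
)
print(response.choices[0].message.content)
```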
"They built a voice search model that doesn’t understand words it understands intent." I don't think that's "unthinkable", plenty of papers on the subject since 2018 and probably earlier too
Google just did the unthinkable. They built a voice search model that doesn’t understand words, it understands intent. It’s called Speech-to-Retrieval (S2R), and it might mark the death of speech-to-text forever. Here’s how it works (and why it matters way more than it sounds)
1
2
5
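The general idea behind a system like the one these tweets describe is a dual-encoder retriever: embed the spoken query and the candidate documents into the same vector space and rank by similarity, skipping the transcription step. The sketch below shows only that generic pattern, not Google's actual S2R model; every module, dimension, and name is illustrative.

```python
# Toy dual-encoder sketch of "retrieve from audio without transcribing":
# an audio encoder and a document encoder map into a shared embedding
# space, and retrieval is nearest-neighbor by cosine similarity.
# Not Google's S2R; all components here are illustrative.
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    def __init__(self, n_mels=80, dim=256):
        super().__init__()
        self.net = nn.GRU(n_mels, dim, batch_first=True)
    def forward(self, mel):                      # (batch, frames, n_mels)
        _, h = self.net(mel)
        return nn.functional.normalize(h[-1], dim=-1)   # (batch, dim)

class DocEncoder(nn.Module):
    def __init__(self, vocab=10_000, dim=256):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)
    def forward(self, token_ids):                # (batch, tokens)
        return nn.functional.normalize(self.emb(token_ids), dim=-1)

audio_enc, doc_enc = AudioEncoder(), DocEncoder()
query = audio_enc(torch.randn(1, 300, 80))            # one spoken query
docs = doc_enc(torch.randint(0, 10_000, (5, 12)))     # five candidate docs
scores = query @ docs.T                               # cosine similarity
print("best match:", scores.argmax(dim=-1).item())
```

In a trained system the two encoders would be learned jointly (e.g. with a contrastive loss) so that audio of a query lands near the documents it should retrieve, which is what "understands intent rather than words" is gesturing at.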