Kyle Corbitt

@corbtt

Followers
14K
Following
5K
Media
191
Statuses
2K

Currently building @OpenPipeAI. Formerly @ycombinator, @google. I am always down to go on a quest.

Seattle, SF
Joined September 2012
@corbtt
Kyle Corbitt
2 months
🚀 Meet ART·E—our open-source RL-trained email research agent that searches your inbox and answers questions more accurately, faster, and cheaper than o3. Let's go deeper on how we built it. 🧵
38
120
956
@corbtt
Kyle Corbitt
19 hours
finished my claude code project with 3% context length to spare ✅
4
0
7
@corbtt
Kyle Corbitt
20 hours
Proud startup moment: We interviewed Soham in February and he failed our tech screen. We have an insanely high bar at @OpenPipeAI and the most cracked team I've ever worked with.
5
1
78
@corbtt
Kyle Corbitt
2 days
RT @LucasAtkins7: You can fake it pretty far in this industry just by saying, “Hrmm, that’s cool but I’m worried it won’t generalize,” when…
0
6
0
@corbtt
Kyle Corbitt
3 days
RT @0xTejpal: Most agent builders (ourselves included) realize within days or weeks that just prompting won't get you far. Incredibly bull…
0
3
0
@corbtt
Kyle Corbitt
3 days
It's becoming increasingly obvious that mass model customization is the future. Even big labs, which have traditionally pushed customers towards using prompting alone, are starting to bet big on fine-tuning. The gains are too huge to ignore.
@theinformation
The Information
4 days
Exclusive: OpenAI is mimicking Palantir in customizing AI models for customers spending $10 million or more. Read more from @AaronpHolmes and @SriMuppidi 👇
11
15
327
@corbtt
Kyle Corbitt
7 days
with bf16 of course. bf stands for better flops.
0
0
13
@corbtt
Kyle Corbitt
7 days
in the spirit of "you can just do things", you can, today, rent a 1-click cluster of 512 H100s on Lambda. that's more FLOPs than the largest supercomputer in the world as of 2022.
3
1
77
@corbtt
Kyle Corbitt
7 days
We're in the "IBM PC clone" era of LLM chat. One will win big, a few will survive, most will disappear.
@LukeW
Luke Wroblewski
7 days
everybody’s building the same thing.
7
0
43
@corbtt
Kyle Corbitt
7 days
ART (agent reinforcement trainer) now supports multi-device training for faster training jobs!
@bradthilton
Brad Hilton
7 days
done.
3
4
55
@corbtt
Kyle Corbitt
7 days
RT @casper_hansen_: This is great evidence that low-hanging fruit is everywhere in RL.
0
10
0
@corbtt
Kyle Corbitt
8 days
I wouldn't have said this 6 months ago, but I now believe all serious agents will be RL'd on their specific task. The gains are too easy and too huge to ignore. Either @OpenAI et al. will provide APIs to do this on-platform, or open source will win.
25
34
654
@corbtt
Kyle Corbitt
8 days
RT @peytoncasper: ill pay someone $2k if they can use puffer lib or openpipes art framework to model a rl cua environment that converges us…
0
3
0
@corbtt
Kyle Corbitt
8 days
We've open sourced all code, data, and lessons learned. We also have a live demo available of the summarization agent. You can find all that in the full write-up here:
4
1
46
@corbtt
Kyle Corbitt
8 days
By directly optimizing on the number of questions that could be successfully answered from the summary, we taught Summary-RL what kinds of information to include. Within 30 training steps it already reached SOTA! ($22 to train)
1
1
37
@corbtt
Kyle Corbitt
8 days
Why did we do this? LLMs are already good at generating summaries, but they don't always focus on the information you care about. RL lets you customize a model to focus specifically on the types of data you want to preserve.
1
1
25
@corbtt
Kyle Corbitt
8 days
Hot RL summer continues: we just released Summary-RL, an RL-trained summarization model that reaches SOTA on ServiceNow's Repliqa summarization benchmark!
6
39
429
@corbtt
Kyle Corbitt
9 days
just so everybody is clear this is @sama's way of announcing that the openai open-source model will be around o3-mini level.
@sama
Sam Altman
9 days
what year do you think an o3-mini level model will run on a phone?
13
3
383
@corbtt
Kyle Corbitt
9 days
Exciting evidence that RL can be incredibly sample efficient: when using GRPO to train a modified version of ART-E (agentic RAG task), we find that we're able to get qwen2.5-14b to exceed gemini 2.5 flash performance with 1 training scenario, and exceed o3 with just 16! This
11
21
267
@corbtt
Kyle Corbitt
9 days
Hope y'all are ready for hot RL summer.
4
3
80
@corbtt
Kyle Corbitt
13 days
if you're looking for a startup idea I'd love a sick genai-native craigslist replacement.
- snap a picture and ai removes background, looks up item online, prepopulates price+listing. one button to post.
- autoresponder. never answer the same question twice.
- AI buyer's agent.
11
2
126