Kevin Lu

@_kevinlu

Followers 7K · Following 1K · Media 14 · Statuses 49

formerly:
- @openai: RL, synthetic data, efficient models
- @weHRTyou: sequential decision making
- @berkeley_ai: decision transformer, universal computation

SF 🏳️‍🌈
Joined October 2020
@_kevinlu
Kevin Lu
10 months
Come check out o1-mini: SoTA math reasoning in a small package with @ren_hongyu @shengjia_zhao @Eric_Wallace_ & the rest of the OpenAI team
@_kevinlu
Kevin Lu
3 days
I dream of having a rich set of economically valuable RL tasks to train on, as wide and beautiful as the internet. What does this pipeline for task creation look like: robotics? trading? enterprise metrics? research? coding? recommendation? video games? I think it's the absolute…
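To make the question concrete, here is a minimal, hypothetical sketch of what a uniform interface for such economically valuable RL tasks could look like: a Gym-style reset/step loop with a verifiable reward. Every name here (RLTask, UnitTestCodingTask) is an illustrative assumption, not anything from the thread or an existing API.

```python
# Hypothetical sketch of a shared RL-task interface; names and details are assumptions.
from typing import Any, Protocol


class RLTask(Protocol):
    domain: str  # e.g. "robotics", "trading", "coding", "video games"

    def reset(self) -> Any:
        """Return the initial observation for a new episode."""
        ...

    def step(self, action: Any) -> tuple[Any, float, bool]:
        """Apply an action; return (observation, reward, done).
        The reward should be verifiable: tests passed, revenue earned, game score."""
        ...


class UnitTestCodingTask:
    """Toy coding task: reward is the fraction of unit tests the submitted code passes."""
    domain = "coding"

    def __init__(self, tests: list) -> None:
        self.tests = tests

    def reset(self) -> str:
        return "Write a function `add(a, b)` that returns a + b."

    def step(self, action: str) -> tuple[str, float, bool]:
        namespace: dict = {}
        try:
            exec(action, namespace)                          # run the submitted code
            passed = sum(bool(t(namespace)) for t in self.tests)
        except Exception:
            passed = 0
        return "", passed / len(self.tests), True            # single-step episode


task = UnitTestCodingTask(tests=[lambda ns: ns["add"](2, 3) == 5])
print(task.step("def add(a, b):\n    return a + b"))         # ('', 1.0, True)
```

In principle the same reset/step shape could cover trading (reward = realized PnL), enterprise metrics, or video games (reward = score), which is what would make a shared task-creation pipeline plausible.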
@_kevinlu
Kevin Lu
3 days
The internet sets up a natural curriculum of skills -- people gradually add new ideas on top of the old ones -- ensuring the models have a smooth difficulty ramp to learn from that covers the skill space. Curriculum will be important for RL -- you need to learn sub-skills before…
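As a toy illustration of the curriculum idea, here is a hedged Python sketch of a sampler that only proposes tasks whose prerequisite sub-skills are already mastered, preferring the easiest of those (a smooth difficulty ramp). The Task fields and the difficulty heuristic are assumptions for illustration, not a description of any real training pipeline.

```python
# Minimal curriculum-sampling sketch; all fields and heuristics are illustrative assumptions.
import random
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    difficulty: float          # estimated difficulty in [0, 1]
    prerequisites: list[str]   # sub-skills assumed to come first


def sample_batch(tasks: list[Task], mastered: set[str], batch_size: int = 8) -> list[Task]:
    """Sample unmastered tasks whose prerequisites are all mastered, easiest first."""
    frontier = [
        t for t in tasks
        if t.name not in mastered and set(t.prerequisites) <= mastered
    ]
    frontier.sort(key=lambda t: t.difficulty)
    # Fall back to uniform sampling if nothing is on the frontier.
    return frontier[:batch_size] or random.sample(tasks, min(batch_size, len(tasks)))


# Example: arithmetic before algebra before calculus.
tasks = [
    Task("arithmetic", 0.1, []),
    Task("algebra", 0.4, ["arithmetic"]),
    Task("calculus", 0.8, ["algebra"]),
]
print([t.name for t in sample_batch(tasks, mastered={"arithmetic"})])  # ['algebra']
```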
@_kevinlu
Kevin Lu
3 days
The internet is incredibly diverse, and it is sourced from data on topics which humans actually cared enough about to engage with in the first place. There are low-resource languages and niche fanbases that will be forever immortalized in AGI because someone cared enough to document them.
@_kevinlu
Kevin Lu
3 days
Ultimately, we want AGI that benefits and interacts with humans, not just something that lives in a toy cage (like AlphaZero for chess, or reasoning models in math). In contrast to other researchers, I think it is therefore imperative to work on product. Are researchers who say…
@_kevinlu
Kevin Lu
3 days
I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: it's a mistake not to add structure now, and a mistake not to remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the…
@_kevinlu
Kevin Lu
3 days
Why you should stop working on RL research and instead work on product // The technology that unlocked the big scaling shift in AI is the internet, not transformers. I think it's well known that data is the most important thing in AI, and also that researchers choose not to work…
@_kevinlu
Kevin Lu
10 days
one of the things we need for ai to mature is more efficient capital allocation:
- what if i buy a large cluster but find no use for it?
- what if i sign a nine figure contract with a company that fails to deliver me a good model?
- what if I pay $100M for a researcher who…
@stevenydc
Steven Yin
10 days
AI companies are the new utilities. Compute goes in → intelligence comes out → distribute through APIs. But unlike power companies, which can stockpile coal and hedge natural gas futures, OpenAI can't stockpile compute. Every idle GPU second = money burned. Use it or lose it.
@_kevinlu
Kevin Lu
12 days
ps. one of the reasons i joined openai was that i thought there were a lot of cool use cases (like video games) that required cheap + fast intelligence, so I thought I should join and release it 🙃
@_kevinlu
Kevin Lu
12 days
ok enough tweeting from me, but if you are also interested in some thoughts about how we can think of inference in ways that are not simply "make the cot longer", i am advertising my blog 🙂
@_kevinlu
Kevin Lu
12 days
more of a fun meme, but i think most people are talking about inference compute, and not enough about inference time. generally in life, we pay extra for things to be fast. but right now the small-model tier (flash, mini, nano, haiku) is both cheap and fast. it's easy to…
@_kevinlu
Kevin Lu
12 days
So I think something else that doesn't get discussed much is the extrapolation of this inference : training trend.
- 2015: back in the day, we would train one model per dataset, and run inference on it once (to obtain the eval result for our paper)
- 2020: with chatgpt, multi-task…
@_kevinlu
Kevin Lu
13 days
By the way, I think some of my old blogs are still relevant too -- there are a lot of threads around inference-time compute, small models, etc. these days; if you are interested in my thoughts (and the human in the loop), you could take a read :)
@_kevinlu
Kevin Lu
13 days
A somewhat little-known fact about me is that I have a blog 😀. Over the weekend I got around to writing up some of my thoughts on the recent LLM-Pokemon craze, and why I think video games are more interesting than most (maybe older) AI researchers think.
- Why is Pokemon hard,…
@_kevinlu
Kevin Lu
13 days
RT @karpathy: The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability….
@_kevinlu
Kevin Lu
3 months
RT @jilin_14: Exciting to share what i've been working on in the past few months! o3 and o4-mini are our first reasoning models with full t….
@_kevinlu
Kevin Lu
3 months
RT @SuvanshSanjeev: we trained a cute lil model!
- will solve a few AIME problems a year if you ask nicely
- cheap image input, 1 million t…
@_kevinlu
Kevin Lu
3 months
Today we released GPT-4.1 nano, an amazing effort led by @johnohallman and @SuvanshSanjeev! Some cool features of today's release:
- Faster & cheaper than 4o-mini
- Significantly cheaper for image processing
- Better reasoning across the board
- 1M input context
@_kevinlu
Kevin Lu
5 months
We released o3-mini, available today to all users in ChatGPT (for free)! o3-mini-low is faster (and often better) than o1-mini, and o3-mini-high is the most capable publicly available reasoning model in the world. with @ren_hongyu @shengjia_zhao
@_kevinlu
Kevin Lu
7 months
We trained o3-mini: both more capable than o1-mini, and around 4x faster end-to-end when accounting for reasoning tokens. with @ren_hongyu @shengjia_zhao & others