Kevin Liu

@kliu128

Followers
10K
Following
6K
Media
53
Statuses
557

Interested in ai, systems, progress, living a good life! Preparedness at @openai, previously @stanford '24

cot token #42,443
Joined August 2016
@kliu128
Kevin Liu
11 days
our progress in automating software engineering is truly exciting to behold
0
0
5
@mpopv
Matt Popovich
1 month
@MissionLoco
MissionLoco
1 month
A cat ran in front of a car and was run over. This happens 26 MILLION times per year in the US. Now @rachelswan and @JackieFielder_ want to ban vehicles. This is how moronic @Hearst @sfchronicle reporters & @sfbos are. It’s like they speak in tongues in a tent in Mississippi.
20
162
4K
@kliu128
Kevin Liu
1 month
Rhythm, Linden, and Yash are awesome people solving a good problem. Excited for what they’ll achieve.
@appliedcompute
Applied Compute
1 month
Generalists are useful, but it’s not enough to be smart. Advances come from specialists, whether human or machine. To have an edge, agents need specific expertise, within specific companies, built on models trained on specific data. We call this Specific Intelligence. It's
0
0
4
@MSFTResearch
Microsoft Research
14 days
Coming December 9 at 8:00 AM PT, our last Microsoft Research Forum episode of the year. Register now:
19
37
522
@willdepue
will depue
1 month
after a long break, i’m back at openai to start a new team with @troyluhman and @eric_luhman1 focused on an incredibly high-risk bet that has a small, but significant, chance of leading to ASI. we’re keeping the team tight but we’re open to high-slope researchers & engineers
98
38
1K
@FedeItaliano76
Federico Italiano
9 months
Air traffic control at Berlin's Tempelhof Central Airport, 1987
16
227
2K
@kliu128
Kevin Liu
1 month
personal favorite is the samosa burger w fries
0
0
3
@ZulobaSt
Zuloba
2 months
Just like stepping on clouds with our Cashmere Slipper Home Socks! Perfect for chilly days around the house. Your new best friend for staying warm and stylish. Get Yours Today!
0
101
1K
@kliu128
Kevin Liu
1 month
Might be the best work SemiAnalysis has done so far
@SemiAnalysis_
SemiAnalysis
1 month
Zareen is one of the go-to places for many SF Bay Area AI researchers to get a quick bite. Most of the food is very good and was even on the Michelin guide in 2020. AI researchers not experienced with Indian cuisine will commonly order their chicken tikka masala with garlic
2
0
7
@kliu128
Kevin Liu
2 months
tbh the only app subscription worth paying for is flighty
1
0
9
@LinkofSunshine
Basil🧡
2 months
When nyc built the 7 train, like 75% of it was literally just through cornfields!
65
220
5K
@LHSummers
Lawrence H. Summers
2 months
A research team at @OpenAI, where I am proud to be a board member, released an important new paper today. This paper looks at what might be thought of as task specific Turing Tests and shows that AI systems, even with limited guidance, perform many tasks -- such as planning
@OpenAI
OpenAI
2 months
Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. https://t.co/uKPPDldVNS
90
236
2K
@tejalpatwardhan
Tejal Patwardhan
2 months
this plot is wild right??
@tejalpatwardhan
Tejal Patwardhan
2 months
We also find that, when paired with human oversight, models have the potential to complete work tasks much faster and cheaper than humans alone.
16
14
253
@kliu128
Kevin Liu
2 months
It's worth reading. https://t.co/W9sGUcHbB0 It's been amazing to see the entire team work day and night on it, and I think it'll contribute significantly to our understanding of how LLMs affect work.
openai.com
We’re introducing GDPval, a new evaluation that measures model performance on economically valuable, real-world tasks across 44 occupations.
0
0
4
@kliu128
Kevin Liu
2 months
GDPval tests models on 1,320 well specified tasks from 44 real knowledge work occupations, written by experts with an average of 14 years of experience in their field.
@tejalpatwardhan
Tejal Patwardhan
2 months
Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval.
1
2
12
@kliu128
Kevin Liu
2 months
Marinade mania
@JeffLadish
Jeffrey Ladish
2 months
There are some truly wild reasoning traces in @apolloaievals & OpenAI's recent paper. The models appear to have developed specific uses for the words "marinade", "overshadow", "illusions", "vantage", and others. This seems likely to be the result of RL training
0
0
3
@playcarmageddon
Carmageddon: Rogue Shift
4 days
— ANNOUNCING CARMAGEDDON: ROGUE SHIFT — Combat racing is back & more brutal than ever. Do you have what it takes to get in the driver's seat?
273
751
5K
@kliu128
Kevin Liu
3 months
coding agents are like the sf central subway: unclear productivity (ridership) gains (yet) but makes people feel warm and fuzzy inside when using it which is valuable in and of itself
0
0
2
@kliu128
Kevin Liu
3 months
waiting to clean the kitchen until generalist home robotics arrive
2
0
15
@kliu128
Kevin Liu
3 months
emergent learning of multi agent workflows, colorized, 2004
0
0
4
@kliu128
Kevin Liu
3 months
have we considered that humans also struggle with long horizon tasks?
2
0
43
@kliu128
Kevin Liu
3 months
First edition swe bench plot shirts
@tejalpatwardhan
Tejal Patwardhan
3 months
ok, if you want a shirt: $17 and all funds raised are donated to @cleanaircatf https://t.co/NsLZIrYf2s
1
0
8