Kevin Wei Profile
Kevin Wei

@kevinlwei

Followers
1K
Following
16K
Media
5
Statuses
62

Science of AI evaluations + U.S. AI policy @RANDCorporation | @Harvard_Law '26, @SchwarzmanOrg '23, @GTOMSCS '22 | Views mine only 🏳️‍🌈 🎉

New York, USA
Joined July 2013
@kevinlwei
Kevin Wei
13 days
We wrote a paper last year about all the ways industry orgs could influence policy. tl;dr: unsurprisingly, there's lots of places you could spend money to influence policy, and industry is massively outspending civil society orgs on AI.
arxiv.org
Industry actors in the United States have gained extensive influence in conversations about the regulation of general-purpose artificial intelligence (AI) systems. Although industry participation...
@Miles_Brundage
Miles Brundage
2 months
AI industry lobbying + PACs will be the most well funded in history, making it all the more important to pass federal legislation soon before the process is completely corrupted.
1
3
18
@kevinlwei
Kevin Wei
19 days
RT @janet_e_egan: Selling H20 (and potentially Blackwell?) chips to China gives up valuable leverage. @ohlennart and I argue there's a smar…
0
41
0
@kevinlwei
Kevin Wei
24 days
I think I also had a high bar for "interdisciplinary" position papers. If the basis of your position is arguments from law, economics, sociology, etc., then I expect you to actually engage with that literature, not just throw around some keywords and citations!
0
0
3
@kevinlwei
Kevin Wei
24 days
Strong +1, my pile also had papers that read like technical papers but without experiments/theory/data, very odd/confusing to me as that's not the point of a position paper imo. (My scores were 1, 2, 3, 3, 10 - the 10 was very good, and I hope it gets an award).
@RishiBommasani
rishi
28 days
I finished my reviews for the NeurIPS position track with an average score of 2/10 and top score of 3/10. I support publishing position papers at AI venues, but authors (and reviewers) should realize that the purpose isn't a shortcut for publishing second-rate work at NeurIPS.
1
0
5
@kevinlwei
Kevin Wei
1 month
I'm the Submissions Editor this year, which means I manage the entire submissions pipeline. Feel free to email me at jolt.submissions@gmail.com with questions.
0
0
0
@kevinlwei
Kevin Wei
1 month
We've also just revamped our website with lots more information! New on the site:
- Details about our review process.
- A data retention policy.
- An AI usage policy (tl;dr: OK to use AI if you disclose, and you're responsible for any errors).
1
0
0
@kevinlwei
Kevin Wei
1 month
As of today, submissions for @HarvardJOLT's spring issue are open! We're looking for law review articles related to law and technology (defined very broadly). Articles can be doctrinal, empirical, historical, philosophical, etc. Scholastica link is in the thread :)
1
3
6
@kevinlwei
Kevin Wei
2 months
RT @michael__aird: 🚀 Come join my team at RAND! We’re looking for research leads, researchers, & project managers for our compute, US AI po…
0
12
0
@kevinlwei
Kevin Wei
2 months
RT @evaluatingevals: 🚨 AI Evals Crisis: Officially kicking off the Eval Science Workstream 🚨 We’re building a shared scientific foundati…
evalevalai.com
Announcing the launch of a research-driven initiative among a community of researchers to strengthen the science of AI evaluations.
0
7
0
@kevinlwei
Kevin Wei
2 months
RT @daniel_d_kang: As AI agents near real-world use, how do we know what they can actually do? Reliable benchmarks are critical but agentic…
0
32
0
@kevinlwei
Kevin Wei
2 months
And shoutout to all our coauthors @SunishchalDev, @m_j_byun, @AnkaReuel, @xave_rg, Rachel Calcott, @EvieCoxon, @chinmay_deshp!
0
0
4
@kevinlwei
Kevin Wei
2 months
We then systematically reviewed 115 human baseline studies and found substantial shortcomings:
* The median sample size is 8 people.
* 98% lack statistical power analysis.
* 67% only report point estimates (no SD or intervals).
* 78% and 59% don't make data or code available 😭😭😭
1
0
2
@kevinlwei
Kevin Wei
2 months
We look at measurement theory from the social sciences to write recommendations for more rigorous human baselines. We also produce a reporting checklist to help make results/methods more transparent.
1
0
2
@kevinlwei
Kevin Wei
2 months
Human baselines add important context to AI evals: ML researchers need them to assess performance differences, users can check them for adoption decisions, and policymakers can use them to understand risk and economic impact. But most human baselines aren't good enough for this!
1
0
2
@kevinlwei
Kevin Wei
2 months
🚨 New paper alert! 🚨 Are human baselines rigorous enough to support claims about "superhuman" performance? Spoiler alert: often not! @prpaskov and I will be presenting our spotlight paper at ICML next week on the state of human baselines + how to improve them!
1
8
20
@kevinlwei
Kevin Wei
3 months
RT @law_ai_: 📢 Last Call for Applications! Apply by May 31 to join one of our three in-person events this summer: 📆 Summer Institute on La…
0
6
0
@kevinlwei
Kevin Wei
4 months
RT @adnhw: Really excited to share my first ever paper! “Third-party compliance reviews for AI safety frameworks” 🚀 See below for more ⬇️…
0
15
0
@kevinlwei
Kevin Wei
5 months
RT @lawfare: "By establishing state data commons, policymakers can help ensure that AI’s benefits extend to all communities, advancing the…
0
2
0