trevor (taylor’s version)

@tmychow

Followers
3K
Following
110
Media
191
Statuses
2K

diffusing the agi

um-helat
Joined July 2017
@tmychow
trevor (taylor’s version)
4 months
openai launched gpt-4.5 to great excitement. yet under 2 months after launch, they announced it would be deprecated. is pre-training over now?
@jackgwhitaker and i show:
1. pre-training scaling laws haven't bent
2. why the marginal dollar left it for RL
3. why it'll come back
Tweet media one
9
12
155
@tmychow
trevor (taylor’s version)
19 hours
RT @taylorswift13: And, baby, that’s show business for you. New album The Life of a Showgirl. Out October 3 ❤️‍🔥
0
151K
0
@tmychow
trevor (taylor’s version)
3 days
what a way to be
Tweet media one
0
0
9
@tmychow
trevor (taylor’s version)
3 days
i explicitly pre-registered this before using it, but even so, i am shocked at how much i like gpt-5 when using it for coding. it doesn't hallucinate, it follows my instructions, and it does sensible non-reward-hacky things.
@tmychow
trevor (taylor’s version)
7 days
day 0 take is gpt-5 will punch above its weight in "doing stuff", compared to what you might think from the usual benchmarks. increasingly, what matters are instruction following / tool calling / long context / hallucination, which all score great. it's also priced very well.
0
0
11
@tmychow
trevor (taylor’s version)
3 days
another gold medal.
@SherylHsu02
Sheryl Hsu
3 days
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻
Tweet media one
0
0
7
@tmychow
trevor (taylor’s version)
3 days
RT @CalaverasAI: To our knowledge, we have sold more code tokens than any other data vendor. Follow this space.
0
1
0
@tmychow
trevor (taylor’s version)
7 days
day 0 take is gpt-5 will punch above its weight in "doing stuff", compared to what you might think from the usual benchmarks. increasingly, what matters are instruction following / tool calling / long context / hallucination, which all score great. it's also priced very well.
0
1
11
@tmychow
trevor (taylor’s version)
9 days
i've known deniz and sherry since we were all at stanford, and they are among the grittiest and fastest moving founders i know - you should work for them!
@kavi_deniz
Deniz Kavi
9 days
We are hiring! I don't tend to talk about the success of @tamarindbio publicly, but we are experiencing incredible demand. Tens of thousands of scientists are regularly using the platform, and we're onboarding GPUs and people as fast as we can. We just posted multiple roles
0
0
9
@tmychow
trevor (taylor’s version)
9 days
recall:
@tmychow
trevor (taylor’s version)
3 months
as a reminder:
claude 1 was good
claude 2 was bad
claude 3 was good
claude 3.5 was bad
claude 3.6 was good
claude 3.7 was bad
claude 4 is good
0
0
0
@tmychow
trevor (taylor’s version)
9 days
unfortunately it stays true
Tweet media one
@tmychow
trevor (taylor’s version)
3 months
odd/even theory stays winning
Tweet media one
3
0
22
@tmychow
trevor (taylor’s version)
9 days
RT @zhengdongwang: in which i get to play my version of 'overrated or underrated' with tyler cowen. listen to tyler's full talk below https…
0
11
0
@tmychow
trevor (taylor’s version)
14 days
relative to many other notions of superhuman swe, the idea of a "bash-only superhuman coder" feels like the one we have the most line of sight to. glad to see it being benchmarked! cc @sambrashears
@_carlosejimenez
carlos
14 days
What happens if you compare LMs on SWE-bench without the fancy scaffolds? Our new leaderboard “SWE-bench (bash only)” shows you which LMs are the best at getting the job done with just bash. More on why this is important 👇
Tweet media one
2
0
9
@tmychow
trevor (taylor’s version)
20 days
there are million dollar bills lying on the pavement.
@cis_female
sophia
21 days
I sold a 100B token code dataset to a bunch of AI companies recently. I'm looking to scale data things into a startup and as part of this I'm looking for people who deeply understand the browser platform to work on RL environments. DM me.
0
0
6
@tmychow
trevor (taylor’s version)
20 days
RT from 🔒:
"heard at icml: everyone *says* their top priority is data, but all their money is still going to architecture"
3
0
28
@tmychow
trevor (taylor’s version)
21 days
designarena [dot] ai is a good start, but really what i want is for someone to spend a month talking to really good designers and building an incredibly rich rubric based on that.
1
0
8
@tmychow
trevor (taylor’s version)
21 days
i wish we had frontend evals that were far more taste-loaded; even as automated coding is improving, most models continue to struggle to make me elegant and aesthetically pleasing frontends.
2
0
15
@tmychow
trevor (taylor’s version)
21 days
incidentally i think it is a very funny fact about the economics profession that a piece we first put out in jan 2023 is still a "recent working paper".
0
0
3
@tmychow
trevor (taylor’s version)
21 days
this week's cover story on the economist: what if AI made the world’s economic growth explode?
Tweet media one
Tweet media two
@currhenry
Henry Curr
21 days
I had a lot of fun researching and writing this piece: what if AI made the world’s economic growth explode? The macroeconomics of an extraordinary thought experiment:
2
0
27
@tmychow
trevor (taylor’s version)
24 days
great corrigibility
Tweet media one
1
0
16
@tmychow
trevor (taylor’s version)
24 days
i love misaligned claude
Tweet media one
6
1
77