leo 🐾
@synthwavedd
Followers
5K
Following
20K
Media
400
Statuses
2K
I'm excited to introduce Hieroglyph, a new benchmark for lateral reasoning. Hieroglyph measures a model's ability to identify the link between seemingly unrelated and often niche subjects. On the 20-question set of the hardest Only Connect questions, no model scores above 50%.
44
37
576
it is heartbreaking how often these incidents have come to occur, and shameful that there has been so little effort to combat them. may they rest in peace
We are very sorry to share that we have confirmed reports of two deceased victims from the active shooting situation at the Barus & Holley engineering building. There are eight additional victims in critical, but stable condition at the hospital. There remains a shelter in place
1
0
8
GPT-5.2 xhigh in Codex would rather spend a week rewriting your kernel to fix a bug than admit it can't do it
21
2
210
Time for a new one of these. What's your current favourite model overall?
15
2
52
Claude Opus 4.5 is still SoTA on SWE-Bench Verified. Total Dario supremacy
13
2
103
Interesting - native image output, the most recent knowledge cutoff of any model (Aug 31 2025), and more expensive than GPT-5. Looks like a new base
12
5
134
GPT-5.2 introduces an 'xhigh' reasoning effort parameter
2
7
124
Deep Research is now available on the API, powered by Gemini 3 Pro. It sets a new SoTA on HLE and DeepSearchQA - impressive!
2
3
50
i fear from what i've heard gpt-5.2 will not be significantly better than gemini 3 pro. sama better keep that code red going 🥀
20
4
205
New Codex PR seems to indicate GPT-5.2 is dropping very soon 👀 The "robin" strings have been updated to "gpt-5.2": https://t.co/GuOmQb7IIp
3
6
142