George Cameron

@grmcameron

Followers
406
Following
679
Media
7
Statuses
211

Co-Founder @ArtificialAnlys | Message me to play 🎾 in SF

San Francisco
Joined January 2022
@grmcameron
George Cameron
19 days
Which models believe the death penalty can be a just punishment? o3 and Grok 3 do, others don't. I created a MicroEval to understand how models respond to controversial questions, including those relating to political, ethical and social topics. Link in the tweet below to read
2
6
29
@grmcameron
George Cameron
17 hours
RT @ArtificialAnlys: Tencent’s latest open weights model Hunyuan-A13B (80B total, 13B active) achieves an Artificial Analysis Intelligence…
0
29
0
@grmcameron
George Cameron
21 hours
RT @ArtificialAnlys: We are hiring! We’re looking for engineers and researchers who want to build the standard for how the world evaluates…
0
4
0
@grmcameron
George Cameron
5 days
RT @ArtificialAnlys: OpenAI's new Deep Research API costs up to ~$30 per API call! These new Deep Research API endpoints might just be the…
0
33
0
@grmcameron
George Cameron
11 days
These models can solve AIME-level maths problems and people wonder why post-training is now the focus of every lab 🤷‍♂️
0
0
1
@grmcameron
George Cameron
12 days
RT @ArtificialAnlys: How do the personalities of the frontier models compare? We had o3 describe their personalities based on responses to…
0
25
0
@grmcameron
George Cameron
13 days
RT @danielhanchen: Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from @Ar…
0
3
0
@grmcameron
George Cameron
19 days
Link to responses:
0
0
3
@grmcameron
George Cameron
20 days
RT @clefourrier: Fun vibe checks prompt (and results) collection!
0
1
0
@grmcameron
George Cameron
21 days
RT @ArtificialAnlys: Takeaways from MicroEvals - Our new feature to easily 'vibe check' models. 1. DeepSeek R1 gets straight to the point:…
0
10
0
@grmcameron
George Cameron
21 days
RT @ArtificialAnlys: Announcing MicroEvals🧩: the fastest way to vibe check models your use case. Every time we benchmark a model, we want t…
0
34
0
@grmcameron
George Cameron
27 days
RT @ArtificialAnlys: Announcing Hardware Benchmarking on Artificial Analysis! We benchmark NVIDIA H100, H200 and B200 systems to analyze th…
0
28
0
@grmcameron
George Cameron
1 month
Are we back?
@ArtificialAnlys
Artificial Analysis
1 month
Google’s updated Gemini 2.5 Pro now leads the AI intelligence frontier, matching OpenAI's o3 in our independent benchmarks. Google’s May update of Gemini 2.5 Pro regressed in some performance evaluations compared to the initial March release. This June update not only fixes
0
0
1
@grmcameron
George Cameron
2 months
RT @ArtificialAnlys: Launching our latest quarterly Artificial Analysis State of AI Report: Our analysis of the key trends shaping AI. A hi…
0
46
0
@grmcameron
George Cameron
2 months
People should pay attention to this chart if forecasting demand for accelerators (GPU or TPU).
@ArtificialAnlys
Artificial Analysis
2 months
Google’s Gemini 2.5 Flash costs 150x more than Gemini 2.0 Flash to run Artificial Analysis Intelligence Index. The increase is driven by:
➤ 9x more expensive output tokens - $3.5 per million with reasoning on ($0.6 with reasoning off) vs $0.4 for Gemini 2.0 Flash
➤ 17x higher
0
0
2
@grmcameron
George Cameron
3 months
'Did you spend all our money benchmarking reasoning models?'
@ArtificialAnlys
Artificial Analysis
3 months
"All told, Artificial Analysis has spent roughly $5,200 evaluating around a dozen reasoning models, close to twice the amount the firm spent analyzing over 80 non-reasoning models ($2,400)." The cost of delivering our Intelligence Index charts! 💸
0
0
2
@grmcameron
George Cameron
3 months
RT @ArtificialAnlys: Llama 4 independent evals: Maverick (402B total, 17B active) beats Claude 3.7 Sonnet, trails DeepSeek V3 but more effi…
0
92
0
@grmcameron
George Cameron
3 months
RT @ArtificialAnlys: Launching our 2025 State of AI Survey! Take part to receive the full survey report and win a pair of Meta Ray-Bans 🕶️…
0
7
0
@grmcameron
George Cameron
4 months
RT @ArtificialAnlys: DeepSeek takes the lead: DeepSeek V3-0324 is now the highest scoring non-reasoning model. This is the first time an op…
0
652
0
@grmcameron
George Cameron
4 months
RT @ArtificialAnlys: OpenAI has launched two new Speech to Text models with impressive accuracy gains! We compare them to Whisper & other m…
0
21
0