
George Cameron
@grmcameron
Followers: 406
Following: 679
Media: 7
Statuses: 211
Co-Founder @ArtificialAnlys | Message me to play 🎾 in SF
San Francisco
Joined January 2022
RT @ArtificialAnlys: Tencent’s latest open weights model Hunyuan-A13B (80B total, 13B active) achieves an Artificial Analysis Intelligence…
0
29
0
RT @ArtificialAnlys: We are hiring! We’re looking for engineers and researchers who want to build the standard for how the world evaluates…
0
4
0
RT @ArtificialAnlys: OpenAI's new Deep Research API costs up to ~$30 per API call! These new Deep Research API endpoints might just be the…
0
33
0
RT @ArtificialAnlys: How do the personalities of the frontier models compare? We had o3 describe their personalities based on responses to…
0
25
0
RT @danielhanchen: Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from @Ar…
0
3
0
RT @ArtificialAnlys: Takeaways from MicroEvals - Our new feature to easily 'vibe check' models. 1. DeepSeek R1 gets straight to the point:…
0
10
0
RT @ArtificialAnlys: Announcing MicroEvals🧩: the fastest way to vibe check models for your use case. Every time we benchmark a model, we want t…
0
34
0
RT @ArtificialAnlys: Announcing Hardware Benchmarking on Artificial Analysis! We benchmark NVIDIA H100, H200 and B200 systems to analyze th…
0
28
0
RT @ArtificialAnlys: Launching our latest quarterly Artificial Analysis State of AI Report: Our analysis of the key trends shaping AI. A hi…
0
46
0
People should pay attention to this chart if forecasting demand for accelerators (GPU or TPU).
Google’s Gemini 2.5 Flash costs 150x more than Gemini 2.0 Flash to run Artificial Analysis Intelligence Index. The increase is driven by:
➤ 9x more expensive output tokens - $3.5 per million with reasoning on ($0.6 with reasoning off) vs $0.4 for Gemini 2.0 Flash
➤ 17x higher
0
0
2
'Did you spend all our money benchmarking reasoning models?'
"All told, Artificial Analysis has spent roughly $5,200 evaluating around a dozen reasoning models, close to twice the amount the firm spent analyzing over 80 non-reasoning models ($2,400)." The cost of delivering our Intelligence Index charts! 💸
0
0
2
RT @ArtificialAnlys: Llama 4 independent evals: Maverick (402B total, 17B active) beats Claude 3.7 Sonnet, trails DeepSeek V3 but more effi…
0
92
0
RT @ArtificialAnlys: Launching our 2025 State of AI Survey! Take part to receive the full survey report and win a pair of Meta Ray-Bans 🕶️…
0
7
0
RT @ArtificialAnlys: DeepSeek takes the lead: DeepSeek V3-0324 is now the highest scoring non-reasoning model. This is the first time an op…
0
652
0
RT @ArtificialAnlys: OpenAI has launched two new Speech to Text models with impressive accuracy gains! We compare them to Whisper & other m…
0
21
0