George Cameron @grmcameron X Profile

George Cameron

@grmcameron

Followers

406

Following

679

Media

7

Statuses

211

Co-Founder @ArtificialAnlys | Message me to play 🎾 in SF

San Francisco

Joined January 2022

George Cameron

@grmcameron

19 days

Which models believe the death penalty can be a just punishment? o3 and Grok 3 do, others don't. I created a MicroEval to understand how models respond to controversial questions, including those relating to political, ethical and social topics. Link in the tweet below to read

2

6

29

George Cameron

@grmcameron

17 hours

RT @ArtificialAnlys: Tencent’s latest open weights model Hunyuan-A13B (80B total, 13B active) achieves an Artificial Analysis Intelligence…

0

29

0

George Cameron

@grmcameron

21 hours

RT @ArtificialAnlys: We are hiring! We’re looking for engineers and researchers who want to build the standard for how the world evaluates…

0

4

0

George Cameron

@grmcameron

5 days

RT @ArtificialAnlys: OpenAI's new Deep Research API costs up to ~$30 per API call! These new Deep Research API endpoints might just be the…

0

33

0

George Cameron

@grmcameron

11 days

These models can solve AIME-level maths problems and people wonder why post-training is now the focus of every lab 🤷‍♂️

0

1

George Cameron

@grmcameron

12 days

RT @ArtificialAnlys: How do the personalities of the frontier models compare? We had o3 describe their personalities based on responses to…

0

25

0

George Cameron

@grmcameron

13 days

RT @danielhanchen: Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from @Ar…

0

3

0

George Cameron

@grmcameron

19 days

Link to responses:

0

3

George Cameron

@grmcameron

20 days

RT @clefourrier: Fun vibe checks prompt (and results) collection!

0

1

0

George Cameron

@grmcameron

21 days

RT @ArtificialAnlys: Takeaways from MicroEvals - Our new feature to easily 'vibe check' models. 1. DeepSeek R1 gets straight to the point:…

0

10

0

George Cameron

@grmcameron

21 days

RT @ArtificialAnlys: Announcing MicroEvals🧩: the fastest way to vibe check models your use case. Every time we benchmark a model, we want t…

0

34

0

George Cameron

@grmcameron

27 days

RT @ArtificialAnlys: Announcing Hardware Benchmarking on Artificial Analysis! We benchmark NVIDIA H100, H200 and B200 systems to analyze th…

0

28

0

George Cameron

@grmcameron

1 month

Are we back?

Artificial Analysis

@ArtificialAnlys

1 month

Google’s updated Gemini 2.5 Pro now leads the AI intelligence frontier, matching OpenAI's o3 in our independent benchmarks. Google’s May update of Gemini 2.5 Pro regressed in some performance evaluations compared to the initial March release. This June update not only fixes

0

1

George Cameron

@grmcameron

2 months

RT @ArtificialAnlys: Launching our latest quarterly Artificial Analysis State of AI Report: Our analysis of the key trends shaping AI. A hi…

0

46

0

George Cameron

@grmcameron

2 months

People should pay attention to this chart if forecasting demand for accelerators (GPU or TPU).

Artificial Analysis

@ArtificialAnlys

2 months

Google’s Gemini 2.5 Flash costs 150x more than Gemini 2.0 Flash to run Artificial Analysis Intelligence Index. The increase is driven by:
➤ 9x more expensive output tokens - $3.5 per million with reasoning on ($0.6 with reasoning off) vs $0.4 for Gemini 2.0 Flash
➤ 17x higher

0

2

George Cameron

@grmcameron

3 months

'Did you spend all our money benchmarking reasoning models?'

Artificial Analysis

@ArtificialAnlys

3 months

"All told, Artificial Analysis has spent roughly $5,200 evaluating around a dozen reasoning models, close to twice the amount the firm spent analyzing over 80 non-reasoning models ($2,400)." The cost of delivering our Intelligence Index charts! 💸

0

2

George Cameron

@grmcameron

3 months

RT @ArtificialAnlys: Llama 4 independent evals: Maverick (402B total, 17B active) beats Claude 3.7 Sonnet, trails DeepSeek V3 but more effi…

0

92

0

George Cameron

@grmcameron

3 months

RT @ArtificialAnlys: Launching our 2025 State of AI Survey! Take part to receive the full survey report and win a pair of Meta Ray-Bans 🕶️…

0

7

0

George Cameron

@grmcameron

4 months

RT @ArtificialAnlys: DeepSeek takes the lead: DeepSeek V3-0324 is now the highest scoring non-reasoning model. This is the first time an op…

0

652

0

George Cameron

@grmcameron

4 months

RT @ArtificialAnlys: OpenAI has launched two new Speech to Text models with impressive accuracy gains! We compare them to Whisper & other m…

0

21

0