lmarena.ai @arena X Profile

lmarena.ai

@arena

Followers

117K

Following

2K

Media

978

Statuses

2K

LMArena: Open Platform for Community-driven AI Benchmarking. Graduated from UC Berkeley / lmsysorg. We’re hiring: https://t.co/1OkfLq1Pba

https://t.co/G4Phmexqmk

US

Joined March 2023

lmarena.ai

@arena

1 month

🚀Introducing Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step. Try Claude, GPT-5, GLM-4.6 and Gemini in Code Arena today!

72

132

1K

lmarena.ai

@arena

15 hours

See how it performs in the Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step. Head over to:

lmarena.ai

Code Arena lets developers compare how top LLMs build apps, websites, games, and more. Watch AI models code, refine, and deploy real software live.

3

1

26

lmarena.ai

@arena

15 hours

🚨MiniMax M2.1 by @MiniMax__AI is in the Code Arena! Test it out and evaluate it directly with other frontier AI. Don't forget to cast your votes in Battle Mode. We’ll publish the results soon. 🗳️

MiniMax (official)

@MiniMax__AI

3 days

You can apply for M2.1 Early Access here:

3

11

135

lmarena.ai

@arena

3 days

Test out Olmo-3.1-32B-Think vs. all the best Text AI models at:

lmarena.ai

An open platform for evaluating AI through human preference

1

0

11

lmarena.ai

@arena

3 days

🚨New Model Update
@Allen_AI’s Olmo-3.1-32B-Think is now available in the Text Arena! This open model is designed to perform strongly on reasoning, instruction following, and research-focused tasks. Bring your toughest prompts and see how it compares as community votes roll

Ai2

@allen_ai

10 days

Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵

8

9

83

lmarena.ai

@arena

4 days

Check out GPT-5.2 vs. all the other top frontier models at:

lmarena.ai

An open platform for evaluating AI through human preference

2

0

17

lmarena.ai

@arena

4 days

🚨BREAKING: Text Leaderboard Updates
GPT-5.2 enters the Text leaderboard at #17, with a score of 1439. Compared to GPT-5.1, the model has improved by +2 points. It trails just one point behind GPT-5.2-high, which is optimized for expert-level reasoning and critical tasks.

OpenAI

@OpenAI

11 days

GPT-5.2 is now rolling out to everyone. https://t.co/nfubPwnIIw

25

19

205

lmarena.ai

@arena

4 days

Cast your vote and follow how GPT-5.2-Search and Grok-4.1-Fast-Search perform among the leading frontier AI search models:

lmarena.ai

An open platform for evaluating AI through human preference

0

9

lmarena.ai

@arena

4 days

🚨🌐Search Leaderboard Update
@OpenAI’s GPT-5.2-Search and @xAI’s Grok-4.1-Fast-Search have landed on the Search Arena leaderboard.
🔹GPT-5.2-Search ranks #2 (score 1211)
🔹Grok-4.1-Fast-Search ranks #4 (score 1185)
Both models debuted ahead of their predecessors, posting

9

15

179

lmarena.ai

@arena

4 days

Check out Reve V1.1 and its fast variant vs. all the top image AI models at:

lmarena.ai

An open platform for evaluating AI through human preference

0

4

lmarena.ai

@arena

4 days

🚨🖼️Image Edit Leaderboard Update
Image AI progress isn’t slowing down. Reve V1.1 and its fast variant have entered the Image Edit leaderboard, both securing strong positions in the top 15.
🔸#8 Reve V1.1 (score of 1255)
🔸#15 Reve V1.1 fast (score of 1226)
This represents a

Reve

@reve

4 days

Reve V1.1 and Reve V1.1 Fast are here. We’ve fine tuned our Reve editing model by changing the configuration based on what you love most.

1

2

47

lmarena.ai

@arena

4 days

Read more in our blog: https://t.co/Lro9tO76bj
To install → pip install arena-rank

news.lmarena.ai

At LMArena, we believe transparency is paramount in AI evaluations. With that in focus, we’re delighted to publish Arena-Rank, an open-source Python package for ranking that powers the LMArena...

0

10

lmarena.ai

@arena

4 days

Arena-Rank is Apache-2.0, pip-installable, and open to contributions. Star and clone the repo: https://t.co/zsxFD0d8gg We’re excited to see what the community builds—and what we can learn together.

github.com

Source Code of LMArena Leaderboard Methodology. Contribute to lmarena/arena-rank development by creating an account on GitHub.

1

0

19

lmarena.ai

@arena

4 days

🔓📊We’re open-sourcing the code that powers the LMArena leaderboards. Today we’re releasing Arena-Rank, a Python package for paired-comparison ranking—so anyone can audit how our leaderboards are computed or use it to build their own.
Arena-Rank implements:
🔹Bradley–Terry &

6

13

99
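For readers unfamiliar with the paired-comparison ranking mentioned in the post above, here is a minimal sketch of a Bradley–Terry fit in plain Python. It is not Arena-Rank's code or API: the function name `bradley_terry`, the `(winner, loser)` input format, and the iterative update used here are assumptions made for illustration, and ties are ignored. The model assumes that model i beats model j with probability p_i / (p_i + p_j) and estimates the strengths p from observed battles.

```python
from collections import defaultdict


def bradley_terry(battles, iters=200, tol=1e-9):
    """Fit Bradley-Terry strengths from a list of (winner, loser) pairs.

    Hypothetical helper for illustration only; not part of arena-rank.
    Returns a dict mapping each model name to a strength, normalized to sum to 1.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # number of battles per unordered pair
    models = set()
    for winner, loser in battles:
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        models.update((winner, loser))
    models = sorted(models)

    # Start from equal strengths and apply the standard iterative update:
    # p_i <- wins_i / sum_j n_ij / (p_i + p_j), then renormalize.
    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        updated = {m: v / total for m, v in updated.items()}
        delta = max(abs(updated[m] - strength[m]) for m in models)
        strength = updated
        if delta < tol:
            break
    return strength


# Toy data: model "A" wins 3 of its 4 battles against model "B".
toy_battles = [("A", "B")] * 3 + [("B", "A")]
toy_strengths = bradley_terry(toy_battles)
```

With only two models the fitted strengths reduce to the observed win rates, so "A" gets 0.75 and "B" gets 0.25 in the toy data. A production leaderboard would also need tie handling and confidence intervals, which this sketch leaves out.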

lmarena.ai

@arena

4 days

Every vote moves the leaderboard. Check out image AI models at:

lmarena.ai

An open platform for evaluating AI through human preference

0

18

lmarena.ai

@arena

4 days

“Create a square image of an open cookbook chapter, describing in detail how to make a Christmas turkey”

3

1

37

lmarena.ai

@arena

4 days

“Take this photo of Einstein and colourise it, add sunglasses and change his coat into an elaborate renaissance outfit”

1

0

31

lmarena.ai

@arena

4 days

“A portrait photo of Albert Einstein, he is wearing a Princeton University wool cardigan - On his shoulder is a luna moth - In his hand he is holding a single sunflower - Behind him there is a poster for the movie "Metropolis" and a Charizard poster - On a pedestal there is a

1

0

33

lmarena.ai

@arena

4 days

“Create a polaroid photograph of a historic image of Richard Feynman giving a physics lecture, with a hand written note on the photo”

1

2

40

lmarena.ai

@arena

4 days

“Create a candid 90s photo with a flash of teenagers at a house party”

1

43

lmarena.ai

@arena

4 days

“Create an image of flashy marketing campaign for decaf espresso aimed at sophisticated toddlers”

1

0

35