lmarena.ai

@arena

Followers
117K
Following
2K
Media
978
Statuses
2K

LMArena: Open Platform for Community-driven AI Benchmarking. Graduated from UC Berkeley / lmsysorg. We’re hiring: https://t.co/1OkfLq1Pba

US
Joined March 2023
@arena
lmarena.ai
1 month
🚀Introducing Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step. Try Claude, GPT-5, GLM-4.6 and Gemini in Code Arena today!
72
132
1K
@arena
lmarena.ai
15 hours
See how it performs in the Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step. Head over to:
lmarena.ai
Code Arena lets developers compare how top LLMs build apps, websites, games, and more. Watch AI models code, refine, and deploy real software live.
3
1
26
@arena
lmarena.ai
15 hours
🚨MiniMax M2.1 by @MiniMax__AI is in the Code Arena! Test it out and evaluate it directly with other frontier AI. Don't forget to cast your votes in Battle Mode. We’ll publish the results soon. 🗳️
@MiniMax__AI
MiniMax (official)
3 days
You can apply for M2.1 Early Access here:
3
11
135
@arena
lmarena.ai
3 days
Test out Olmo-3.1-32B-Think vs. all the best Text AI models at:
lmarena.ai
An open platform for evaluating AI through human preference
1
0
11
@arena
lmarena.ai
3 days
🚨New Model Update @Allen_AI’s Olmo-3.1-32B-Think is now available in the Text Arena! This open model is designed to perform strongly on reasoning, instruction following, and research-focused tasks. Bring your toughest prompts and see how it compares as community votes roll
@allen_ai
Ai2
10 days
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
8
9
83
@arena
lmarena.ai
4 days
Check out GPT-5.2 vs. all the other top frontier models at:
lmarena.ai
An open platform for evaluating AI through human preference
2
0
17
@arena
lmarena.ai
4 days
🚨BREAKING: Text Leaderboard Updates GPT-5.2 enters the Text leaderboard at #17, with a score of 1439. Compared to GPT-5.1, the model has improved by +2 points. It trails just one point behind GPT-5.2-high, which is optimized for expert-level reasoning and critical tasks.
@OpenAI
OpenAI
11 days
GPT-5.2 is now rolling out to everyone. https://t.co/nfubPwnIIw
25
19
205
@arena
lmarena.ai
4 days
Cast your vote and follow how GPT-5.2-Search and Grok-4.1-Fast-Search perform among the leading frontier AI search models:
lmarena.ai
An open platform for evaluating AI through human preference
0
0
9
@arena
lmarena.ai
4 days
🚨🌐Search Leaderboard Update @OpenAI’s GPT-5.2-Search and @xAI’s Grok-4.1-Fast-Search have landed on the Search Arena leaderboard. 🔹GPT-5.2-Search ranks #2 (score 1211) 🔹Grok-4.1-Fast-Search ranks #4 (score 1185) Both models debuted ahead of their predecessors, posting
9
15
179
@arena
lmarena.ai
4 days
Check out Reve V1.1 and its fast variant vs. all the top image AI models at:
lmarena.ai
An open platform for evaluating AI through human preference
0
0
4
@arena
lmarena.ai
4 days
🚨🖼️Image Edit Leaderboard Update Image AI progress isn’t slowing down. Reve V1.1 and its fast variant have entered the Image Edit leaderboard, both securing strong positions in the top 15. 🔸#8 Reve V1.1 (score of 1255) 🔸#15 Reve V1.1 fast (score of 1226) This represents a
@reve
Reve
4 days
Reve V1.1 and Reve V1.1 Fast are here. We’ve fine tuned our Reve editing model by changing the configuration based on what you love most.
1
2
47
@arena
lmarena.ai
4 days
Arena-Rank is Apache-2.0, pip-installable, and open to contributions. Star and clone the repo: https://t.co/zsxFD0d8gg We’re excited to see what the community builds—and what we can learn together.
github.com
Source Code of LMArena Leaderboard Methodology. Contribute to lmarena/arena-rank development by creating an account on GitHub.
1
0
19
@arena
lmarena.ai
4 days
🔓📊We’re open-sourcing the code that powers the LMArena leaderboards. Today we’re releasing Arena-Rank, a Python package for paired-comparison ranking—so anyone can audit how our leaderboards are computed or use it to build their own. Arena-Rank implements: 🔹Bradley–Terry &
6
13
99
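For readers unfamiliar with paired-comparison ranking, here is a minimal sketch of a Bradley-Terry fit in plain Python. It is not the Arena-Rank API and not necessarily how the LMArena leaderboards are computed: the function names `fit_bradley_terry` and `win_probability`, the `(winner, loser)` input format, and the choice of the classic minorize-maximize update are all assumptions made for illustration, and ties are ignored.

```python
from collections import defaultdict


def fit_bradley_terry(battles, iters=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Illustrative only: uses the minorize-maximize update and ignores ties.
    A model with zero wins is driven toward strength 0.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # number of battles per unordered pair
    models = set()
    for winner, loser in battles:
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        models.update((winner, loser))
    models = sorted(models)

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for m in models:
            denom = 0.0
            for other in models:
                if other == m:
                    continue
                n = pair_counts.get(frozenset((m, other)), 0.0)
                if n:
                    denom += n / (strength[m] + strength[other])
            updated[m] = wins[m] / denom if denom else strength[m]
        # Strengths are only defined up to scale, so renormalize each round.
        total = sum(updated.values())
        updated = {m: v * len(models) / total for m, v in updated.items()}
        delta = max(abs(updated[m] - strength[m]) for m in models)
        strength = updated
        if delta < tol:
            break
    return strength


def win_probability(strength, a, b):
    """Bradley-Terry probability that model a beats model b."""
    return strength[a] / (strength[a] + strength[b])


if __name__ == "__main__":
    # Hypothetical votes: model "A" beats model "B" in 3 of 4 battles.
    votes = [("A", "B"), ("A", "B"), ("A", "B"), ("B", "A")]
    fitted = fit_bradley_terry(votes)
    print(round(win_probability(fitted, "A", "B"), 3))
```

With only two models, the fitted win probability simply matches the observed win rate; the model becomes more useful when many models are compared through overlapping pairs of votes.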
@arena
lmarena.ai
4 days
Every vote moves the leaderboard. Check out image AI models at:
lmarena.ai
An open platform for evaluating AI through human preference
0
0
18
@arena
lmarena.ai
4 days
“Create a square image of an open cookbook chapter, describing in detail how to make a Christmas turkey”
3
1
37
@arena
lmarena.ai
4 days
“Take this photo of Einstein and colourise it, add sunglasses and change his coat into an elaborate renaissance outfit”
1
0
31
@arena
lmarena.ai
4 days
“A portrait photo of Albert Einstein, he is wearing a Princeton University wool cardigan - On his shoulder is a luna moth - In his hand he is holding a single sunflower - Behind him there is a poster for the movie "Metropolis" and a Charizard poster - On a pedestal there is a
1
0
33
@arena
lmarena.ai
4 days
“Create a polaroid photograph of a historic image of Richard Feynman giving a physics lecture, with a hand written note on the photo”
1
2
40
@arena
lmarena.ai
4 days
“Create a candid 90s photo with a flash of teenagers at a house party”
1
1
43
@arena
lmarena.ai
4 days
“Create an image of flashy marketing campaign for decaf espresso aimed at sophisticated toddlers”
1
0
35