Casper Hansen @casper_hansen_ X Profile

Casper Hansen

@casper_hansen_

Followers

8K

Following

2K

Media

398

Statuses

4K

NLP Scientist | AutoAWQ Creator | Open-Source Contributor

Joined August 2019

Casper Hansen

@casper_hansen_

3 months

2.1k stars, 2+ million downloads, and 7000+ models on Huggingface later, and I am officially ready to retire my long-time project AutoAWQ ⚡️. Proud to say that AutoAWQ has been adopted by the @vllm_project and will now be maintained by 55+ contributors 🥳

8

12

140

Casper Hansen

@casper_hansen_

9 hours

Kimi K2 did what we know works. Muon and WSD, DeepSeek V3 arch, and a bunch of tokens. It’s that simple. No need to change your data mixture or the expert routing mechanism while training (*hrhr* Llama 4).

6

5

117

Casper Hansen

@casper_hansen_

13 hours

Prior to the Kimi K2 release, I beta tested it with Claude Code. Here is what it cost me:
- $3.43 for 816 requests.
Notably, the output token count is low, suggesting they are not breaking the bank. What if you used Claude 4 Opus? My bet is $100-200.

8

187

Casper Hansen

@casper_hansen_

13 hours

Can someone do a dense Kimi K2 32B by finding the optimal experts?

1

0

9

Casper Hansen

@casper_hansen_

22 hours

too much control over training invites constant tweaking. lock in your best guess for data + config, then commit. cancel the run if there are signs of failure.

0

1

Casper Hansen

@casper_hansen_

22 hours

It’s beautiful how the Huggingface thesis has been playing out for a while. They own the distribution center of the world of AI. Your model does not exist to most people if not on Huggingface.

1

21

Casper Hansen

@casper_hansen_

23 hours

Predicted this a bit of time ago. The signs were there. Reasoning model, DeepResearch model, and now the most powerful open-source model.

Casper Hansen

@casper_hansen_

21 days

S-tier AI Labs (Chinese):
- DeepSeek (R1)
- ByteDance Seed (veRL, Seedance)
- Moonshot AI (Kimi 1.5 ≈ old R1, Kimi-Researcher)

0

3

Casper Hansen

@casper_hansen_

1 day

Oh no, it's over, and I'm banned™️

0

Casper Hansen

@casper_hansen_

1 day

The only reason Kimi K2 is not on OpenRouter is that Chutes has not hosted it yet. Actually amazing lol.

7

2

51

Casper Hansen

@casper_hansen_

2 days

Meta offers being sent out as we speak😭

Zephyr

@zephyr_z9

2 days

BTW, the founder used to work at Meta.

0

4

Casper Hansen

@casper_hansen_

2 days

imagine when they add o3 type of reasoning to this. it will be so over, instant sota.

Kimi.ai

@Kimi_Moonshot

2 days

🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now.
With Kimi K2, advanced agentic intelligence

4

0

25

Casper Hansen

@casper_hansen_

2 days

The AMD comeback is insane, partially carried by SGLang, vLLM, and even xAI. It will be interesting to see what happens with MI400.

1

0

15

Casper Hansen

@casper_hansen_

2 days

I have been trying Kimi K2 for the last few days. It’s trained to be a Claude Sonnet competitor and a drop-in replacement in Claude Code. It’s a DeepSeek V3 architecture with 1T parameters, 32B active!

1

0

18

Casper Hansen

@casper_hansen_

2 days

Kimi K2 - Big model smell👀

0

1

36

Casper Hansen

@casper_hansen_

2 days

All you need to try Grok 4 is LMArena. Available for free and in direct chat. I wish this was available for free in the X or Grok app.

0

5

Casper Hansen

@casper_hansen_

2 days

"error happened, factory reset your phone to continue". guess it will be more than 15 minutes. .

0

1

Casper Hansen

@casper_hansen_

2 days

i need uv but for iphone. ain't no way it should take 15 minutes to set up.

2

0

5

Casper Hansen

@casper_hansen_

2 days

I have an arcane iPhone XS and just bought an iPhone 16. I wonder if I will feel a difference when I pick it up.

0

4

Casper Hansen

@casper_hansen_

2 days

Releasing data, models, and code openly and FIRST is the only right way. Then you can always write a paper with more details on what you did. This is the open way.

0

5

Casper Hansen

@casper_hansen_

2 days

>Meet Dan Hendrycks
>Creates MMLU
>Labs adopt it
>Labs ace it
>Creates Humanity's Last Exam
>The final closed-ended academic benchmark
That lasted until about the time Grok 4 was released.

0

7
