Arjun @arjunkocher X Profile

Arjun

@arjunkocher

Followers

4K

Following

4K

Media

461

Statuses

6K

AI Research

Joined September 2009

Arjun

@arjunkocher

10 days

RT @Kimi_Moonshot: 🆕 Say hello to kimi-k2-turbo-preview. Same model. Same context. NOW 4× FASTER. ⚡️ From 10 tok/s to 40 tok/s. 💰 Limited-…

0

179

0

Arjun

@arjunkocher

14 days

RT @teortaxesTex: Zai is moving further from DeepSeek recipe. Stable Muon, 23T tokens, deeper, thinner, more computationally intense. We'll…

0

18

0

Arjun

@arjunkocher

16 days

there are humans. then there are Noam Shazeers.

elie

@eliebakouch

17 days

Noam Shazeer 2020 paper with no equation, just pseudo code with einsum

0

2

9

Arjun

@arjunkocher

18 days

New multimodal LLM from StepFun just landed.
- MoE architecture: 321B total params, 38B active.
- Open-sources on 31st July.
try here:

stepfun.com

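Step AI (阶跃AI) is a smart, reliable personal productivity assistant that helps you acquire knowledge, look up information, learn languages, write creatively, and write code, solving problems for you across work, study, and everyday life. It takes you to discover and understand the world.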

0

1

14

Arjun

@arjunkocher

21 days

read here:

github.com

Kimi K2 is the large language model series developed by Moonshot AI team - MoonshotAI/Kimi-K2

0

2

Arjun

@arjunkocher

21 days

Much-awaited Kimi-K2 Technical Report is out now! Kimi K2 is a 1T-parameter open-weight MoE model built for agentic intelligence. Using the MuonClip optimizer and a 15.5T-token high-quality dataset, Kimi K2 achieves stable, scalable pre-training. Post-training combines large-scale

1

2

34

Arjun

@arjunkocher

26 days

0

2

Arjun

@arjunkocher

26 days

Mixture of Raytraced Experts
- MRE replaces fixed top-k MoE gating with a dynamic, stochastic raytracing mechanism.
- firing ray probabilistically activates a sequence of experts using a routing net, like a poisson walk on a softmax graph.
- no load balancing, hard top-k,

1

2

37

Arjun

@arjunkocher

27 days

huggingface.co

1

0

3

Arjun

@arjunkocher

27 days

Nous released their Hermes 3 dataset.
- 1m samples.
- uncensored sota for its time across llama-3 (8b, 70b, 405b).
- dense in-prompt adherence, roleplay, subjective/objective tasks.
- rich tool use, structured outputs, api-like call patterns.
- early agentic traces: xml-tagged

2

3

38

Arjun

@arjunkocher

1 month

RT @teortaxesTex: Oh.

0

2

0

Arjun

@arjunkocher

1 month

Kimi.ai

@Kimi_Moonshot

1 month

🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model.
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models.
🔹 Strong in coding and agentic tasks.
🐤 Multimodal & thought-mode not supported for now.
With Kimi K2, advanced agentic intelligence

0

1

Arjun

@arjunkocher

1 month

The 1 Trillion param Open-Source Agentic Model from Kimi Moonshot is live.
Kimi K-2 ✌🏻 (based on MuonClip Optimizer)
get started here:

1

0

16

Arjun

@arjunkocher

1 month

RT @teortaxesTex: Claim. Source: “revealed in a dream”. I do sometimes get leaks in dreams, and Arjun is Indian so maybe it's a Ramanujan s…

0

1

0

Arjun

@arjunkocher

1 month

Kimi’s next drop gonna be spicy 🥵

1

0

24

Arjun

@arjunkocher

1 month

0

3

Arjun

@arjunkocher

1 month

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
adversarial selfplay can yield cognitive dividends in LLMs trained for general reasoning without explicit labels or human reward shaping. instead of finetuning on

1

3

38

Arjun

@arjunkocher

1 month

0

1

Arjun

@arjunkocher

1 month

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
AB‑MCTS (Adaptive Branching Monte Carlo Tree Search) inference-time LLMs strategy. repeated sampling w iterative refinement, guided by external feedback. each node uses bayesian posterior

2

1

17

Arjun

@arjunkocher

1 month

github.com

A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework. - AIDC-AI/Ovis-U1

0

1