Arjun

@arjunkocher

Followers
4K
Following
4K
Media
461
Statuses
6K

AI Research

Joined September 2009
@arjunkocher
Arjun
10 days
RT @Kimi_Moonshot: 🆕 Say hello to kimi-k2-turbo-preview. Same model. Same context. NOW 4× FASTER. ⚡️ From 10 tok/s to 40 tok/s. 💰 Limited-…
0
179
0
@arjunkocher
Arjun
14 days
RT @teortaxesTex: Zai is moving further from DeepSeek recipe. Stable Muon, 23T tokens, deeper, thinner, more computationally intense. We'll…
0
18
0
@arjunkocher
Arjun
16 days
there are humans. then there are Noam Shazeers.
@eliebakouch
elie
17 days
Noam Shazeer 2020 paper with no equation, just pseudocode with einsum
0
2
9
@arjunkocher
Arjun
21 days
The much-awaited Kimi K2 Technical Report is out now! Kimi K2 is a 1T-parameter open-weight MoE model built for agentic intelligence. Using the MuonClip optimizer and a 15.5T-token high-quality dataset, Kimi K2 achieves stable, scalable pre-training. Post-training combines large-scale
1
2
34
@arjunkocher
Arjun
26 days
0
0
2
@arjunkocher
Arjun
26 days
Mixture of Raytraced Experts
—
- MRE replaces fixed top-k MoE gating with a dynamic, stochastic raytracing mechanism.
- firing ray probabilistically activates a sequence of experts using a routing net, like a poisson walk on a softmax graph.
- no load balancing, hard top-k,
1
2
37
@arjunkocher
Arjun
27 days
huggingface.co
1
0
3
@arjunkocher
Arjun
27 days
Nous released their Hermes 3 dataset.
—
- 1m samples.
- uncensored sota for its time across llama-3 (8b, 70b, 405b).
- dense in-prompt adherence, roleplay, subjective/objective tasks.
- rich tool use, structured outputs, api-like call patterns.
- early agentic traces: xml-tagged
2
3
38
@arjunkocher
Arjun
1 month
RT @teortaxesTex: Oh.
0
2
0
@arjunkocher
Arjun
1 month
@Kimi_Moonshot
Kimi.ai
1 month
🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now.
With Kimi K2, advanced agentic intelligence
0
0
1
@arjunkocher
Arjun
1 month
The 1 Trillion param Open-Source Agentic Model from Kimi Moonshot is live. Kimi K-2 ✌🏻. (based on MuonClip Optimizer). get started here:
1
0
16
@arjunkocher
Arjun
1 month
RT @teortaxesTex: Claim. Source: “revealed in a dream”. I do sometimes get leaks in dreams, and Arjun is Indian so maybe it's a Ramanujan s…
0
1
0
@arjunkocher
Arjun
1 month
Kimi’s next drop gonna be spicy 🥵
1
0
24
@arjunkocher
Arjun
1 month
0
0
3
@arjunkocher
Arjun
1 month
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
—
adversarial self-play can yield cognitive dividends in LLMs trained for general reasoning without explicit labels or human reward shaping. instead of finetuning on
1
3
38
@arjunkocher
Arjun
1 month
0
0
1
@arjunkocher
Arjun
1 month
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
—
AB‑MCTS (Adaptive Branching Monte Carlo Tree Search) is an inference-time strategy for LLMs: repeated sampling with iterative refinement, guided by external feedback. each node uses bayesian posterior
2
1
17