We are excited to share a new milestone: we've open-sourced dInfer, a high-performance inference framework for diffusion language models (dLLMs). 10.7x speedup over NVIDIA's diffusion model framework Fast-dLLM. 1,011 tokens per second in single-batch inference, on the
@TheInclusionAI The 1,011 tokens/second on HumanEval is remarkable code-generation speed. What's really fascinating is how dInfer manages to outperform even highly optimized autoregressive frameworks in single-batch scenarios; this challenges some conventional wisdom about model architectures.
Tiny Recursive Models: a tiny 7M-parameter model that recursively refines its answer beats LLMs 100x larger on hard puzzles like ARC-AGI. We independently reproduced the paper, corroborated the results, and released the weights + API access for those looking to benchmark it.
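The core idea, repeatedly applying one small network to refine a latent scratchpad and a candidate answer rather than generating once, can be sketched roughly as below. This is a minimal illustration, not the paper's code: the weights are random stand-ins, the dimensions and the `refine_step` update rule are assumptions, and a real model would be trained end to end.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16

# Random stand-in weights for a small two-layer MLP refiner
# (a real Tiny Recursive Model would learn these).
W1 = rng.normal(0, 0.1, (3 * dim, dim))
W2 = rng.normal(0, 0.1, (dim, dim))

def refine_step(x, y, z):
    """One refinement: update latent z from (x, y, z), then answer y from z."""
    h = np.maximum(np.concatenate([x, y, z]) @ W1, 0.0)  # ReLU MLP on the concat
    z = z + h @ W2          # residual update of the latent scratchpad
    y = y + z               # simplified answer head: read the answer off the latent
    return y, z

x = rng.normal(size=dim)    # embedded puzzle input
y = np.zeros(dim)           # initial answer guess
z = np.zeros(dim)           # latent scratchpad
for _ in range(8):          # outer loop: refine the same answer repeatedly
    y, z = refine_step(x, y, z)
```

The point of the recursion is parameter efficiency: depth comes from iterating the same tiny network, so compute scales with refinement steps while the parameter count stays fixed.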
Playing around with training a tiny 11M-parameter character-level text diffusion model! It's a WIP, but the code is currently a heavily modified nanochat GPT implementation (changed from autoregressive decoding to diffusion) and trained on the Tiny Shakespeare dataset. The
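The decoding-side change described above, from left-to-right sampling to iterative unmasking, can be sketched like this. Everything here is a hypothetical illustration: `fake_denoiser` is a random stand-in for the trained character predictor, and the mask token, vocabulary, and unmasking schedule are assumptions, not the author's code.

```python
import random

MASK = "_"
VOCAB = "abcdefghijklmnopqrstuvwxyz "

def fake_denoiser(seq):
    # Stand-in for the trained network: a real denoiser would predict
    # characters for all masked positions in parallel, conditioned on context.
    return [random.choice(VOCAB) if c == MASK else c for c in seq]

def diffusion_decode(length=16, steps=4, seed=0):
    random.seed(seed)
    seq = [MASK] * length                   # start from a fully masked sequence
    for step in range(steps, 0, -1):
        proposal = fake_denoiser(seq)
        # Commit a fraction of the remaining masked positions each step,
        # so the text is filled in over a few rounds rather than one pass.
        masked = [i for i, c in enumerate(seq) if c == MASK]
        random.shuffle(masked)
        for i in masked[: max(1, len(masked) // step)]:
            seq[i] = proposal[i]
    return "".join(fake_denoiser(seq))      # final pass fills any leftovers
```

Unlike autoregressive decoding, each denoising pass updates many positions at once, which is the source of the parallelism that diffusion LMs trade against per-step quality.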
We release Pico-Banana-400K, a large-scale, high-quality image-editing dataset distilled from Nano-Banana across 35 editing types. Data link: https://t.co/mi06ddf3mN Paper link: https://t.co/AaZM02xcJr It includes 258K single-turn image-editing examples and 72K multi-turn