@TheInclusionAI
InclusionAI
10 days
We are excited to share a new milestone: we've open-sourced dInfer, a high-performance inference framework for diffusion language models (dLLMs). 🚀 10.7X speedup over NVIDIA's diffusion model framework Fast-dLLM. 🧠 1,011 tokens per second in single-batch inference, on the …
2 replies · 37 reposts · 255 likes

Replies

@OpenBMB
OpenBMB
9 days
@TheInclusionAI The 1,011 tokens/second on HumanEval is remarkable code-generation speed. What's really fascinating is how dInfer manages to outperform even highly optimized autoregressive frameworks in single-batch scenarios; this challenges some conventional wisdom about model architectures.
1 reply · 0 reposts · 4 likes
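For intuition on why a diffusion LM can win at batch size 1: an autoregressive model spends one forward pass per generated token, while a masked-diffusion decoder can commit several tokens per pass. The toy sketch below shows only that pass-count contrast; it is not dInfer's API, and the stand-in `model` and confidence-based unmasking schedule are illustrative assumptions.

```python
import torch

# Toy contrast between autoregressive and masked-diffusion decoding.
# Not dInfer's implementation: `model` is a stand-in network and the
# confidence-based unmasking schedule is only illustrative.

VOCAB, SEQ_LEN, MASK = 100, 32, 100  # mask id sits outside the vocab

def model(x):
    # Stand-in for a transformer: random logits over the vocabulary.
    return torch.randn(x.shape[0], x.shape[1], VOCAB)

def autoregressive_decode(prompt):
    # One forward pass per generated token.
    seq, passes = prompt.clone(), 0
    while seq.shape[1] < SEQ_LEN:
        logits = model(seq)
        passes += 1
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        seq = torch.cat([seq, next_tok], dim=1)
    return seq, passes

def diffusion_decode(prompt, steps=4):
    # Start fully masked, then commit the highest-confidence positions
    # in parallel at each step: only `steps` forward passes in total.
    seq = torch.full((1, SEQ_LEN), MASK)
    seq[:, : prompt.shape[1]] = prompt
    for step in range(steps):
        conf, pred = model(seq).softmax(-1).max(-1)
        masked = seq == MASK
        conf[~masked] = -1.0             # never touch committed tokens
        k = int(masked.sum()) // (steps - step)
        idx = conf.topk(k, dim=-1).indices
        seq[0, idx[0]] = pred[0, idx[0]]
    return seq, steps

prompt = torch.randint(0, VOCAB, (1, 4))
print(autoregressive_decode(prompt)[1])  # 28 forward passes
print(diffusion_decode(prompt)[1])       # 4 forward passes
```

The reported 10.7X presumably also depends on systems-level optimizations inside dInfer; the sketch captures only the per-pass arithmetic behind the single-batch advantage.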
@TheInclusionAI
InclusionAI
9 days
@OpenBMB sharp insight 👍
0 replies · 0 reposts · 4 likes
@kalyan_kpl
Kalyan KS
10 days
@TheInclusionAI Congrats.
1 reply · 0 reposts · 1 like
@TheInclusionAI
InclusionAI
9 days
@kalyan_kpl Thanks.
0 replies · 0 reposts · 1 like
@askalphaxiv
alphaXiv
1 day
Tiny Recursive Models: a tiny 7M-parameter model that recursively refines its answer beats LLMs 100x larger on hard puzzles like ARC-AGI. We independently reproduced the paper, corroborated the results, and released the weights + API access for those looking to benchmark it 🔍
21 replies · 80 reposts · 584 likes
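The core trick, as we read the paper, is weight-tied recursion: one small network alternately updates a latent scratchpad and the current answer, so depth comes from iteration rather than parameters. A minimal PyTorch sketch under that reading; the dimensions, step counts, and module names here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Generic sketch of recursive answer refinement in the spirit of Tiny
# Recursive Models: the same tiny network is applied repeatedly,
# refining a latent state and then revising the answer.

class TinyRefiner(nn.Module):
    def __init__(self, dim=128, inner_steps=6, outer_steps=3):
        super().__init__()
        self.inner_steps, self.outer_steps = inner_steps, outer_steps
        self.latent_step = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.answer_step = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, question):
        answer = torch.zeros_like(question)  # initial guess
        latent = torch.zeros_like(question)  # scratchpad state
        for _ in range(self.outer_steps):
            # Think: refine the latent several times given (q, a, z).
            for _ in range(self.inner_steps):
                latent = self.latent_step(
                    torch.cat([question, answer, latent], dim=-1))
            # Act: revise the answer from the refined latent.
            answer = self.answer_step(
                torch.cat([answer, latent], dim=-1))
        return answer

q = torch.randn(4, 128)        # batch of embedded puzzle inputs
print(TinyRefiner()(q).shape)  # torch.Size([4, 128])
```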
@nathanbarrydev
Nathan Barry
1 day
Playing around with training a tiny 11M-parameter character-level text diffusion model! It's a WIP, but the code is currently a heavily modified nanochat GPT implementation (changed from autoregressive decoding to diffusion) and trained on the Tiny Shakespeare dataset. The …
47 replies · 143 reposts · 2K likes
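For anyone curious what the AR-to-diffusion conversion amounts to in a trainer like this, the loss swap is small: instead of next-character prediction under a causal mask, you corrupt each sequence by masking a random fraction of characters and train a bidirectional model to recover them. A hedged sketch; `model` and `MASK_ID` are assumptions, not the tweet author's actual code.

```python
import torch
import torch.nn.functional as F

# Core training-objective change when converting a character-level GPT
# trainer from autoregressive to masked-diffusion: mask a random
# fraction of characters and train the (non-causal) model to recover
# them. `model` stands in for the modified nanochat network.

MASK_ID = 256  # one id past the 256 raw byte values

def diffusion_loss(model, batch):
    # Sample a masking rate per sequence, as in masked-diffusion LMs.
    rate = torch.rand(batch.shape[0], 1, device=batch.device)
    masked = torch.rand_like(batch, dtype=torch.float) < rate
    noisy = torch.where(masked, torch.full_like(batch, MASK_ID), batch)
    logits = model(noisy)  # bidirectional attention, no causal mask
    # Only the masked positions contribute to the loss.
    return F.cross_entropy(logits[masked], batch[masked])
```

Inference then runs the reverse process, starting fully masked and iteratively unmasking, as in the decoding sketch further up the thread.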
@zhegan4
Zhe Gan
18 hours
๐ŸŽ๐ŸŽ We release Pico-Banana-400K, a large-scale, high-quality image editing dataset distilled from Nana-Banana across 35 editing types. ๐Ÿ”— Data link: https://t.co/mi06ddf3mN ๐Ÿ”—Paper link: https://t.co/AaZM02xcJr It includes 258K single-turn image editing data, 72K multi-turn
5 replies · 57 reposts · 384 likes
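To make the single-turn vs multi-turn distinction concrete, here is a purely hypothetical shape such records could take; the real schema is whatever the data link above defines, not this.

```python
# Hypothetical illustration only; not the Pico-Banana-400K schema.

single_turn = {
    "image": "input.png",
    "instruction": "make the sky look like sunset",
    "edited_image": "output.png",
    "edit_type": "global color/lighting",  # one of ~35 edit types
}

multi_turn = {
    "image": "input.png",
    "turns": [
        {"instruction": "remove the car on the left",
         "edited_image": "step1.png"},
        {"instruction": "now add falling snow",
         "edited_image": "step2.png"},
    ],
}
```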
@HarveenChadha
Harveen Singh Chadha
1 day
a bunch of OCR models released in the past few weeks:
~ deepseek-ocr-3b
~ olmo-ocr-2-7b
~ chandra-ocr-8b
~ nanonets-ocr2-3b
~ paddleocr-vl-0.9B
~ qwen3-vl-dense/moe (general VLM)
~ dots.ocr-3b
Will be dropping a detailed comparison soon
34 replies · 109 reposts · 1K likes