Satoshi Matsuoka

@ProfMatsuoka

Followers 25K · Following 21K · Media 1K · Statuses 44K

Director, RIKEN Center for Computational Science (R-CCS); Specially Appointed Professor, Institute of Science Tokyo. ACM/ISC/JSSST/IPSJ Fellow; IEEE Fernbach (2014) & Cray (2022) Awards; Medal with Purple Ribbon, 2022

Kobe, Japan
Joined July 2009
@ProfMatsuoka
Satoshi Matsuoka
5 hours
Registration for SC/HPC Asia 2026 @ Osaka is open. Register now! https://t.co/ZbVLZB6mXS It will become the biggest HPC/AI/Quantum research & industry event in Asia, at the scale of ISC in Europe, with over 100 global exhibitors, 36 papers accepted of 101 submitted, 28 workshops, 20 tutorials,
1
6
12
@ProfMatsuoka
Satoshi Matsuoka
5 hours
So register now, and if possible reserve a nearby hotel, including the adjacent Rihga Royal Hotel Osaka https://t.co/W5SbnTxS0b. The room rates are still reasonable, with a capacity of 1,000 rooms, but they could sell out fast given the massive influx of tourism.
rihga.com
The RIHGA Royal Hotel Osaka Vignette Collection, located in Nakanoshima, the heart of Osaka. Directly connected to KEIHAN Railway Nakanoshima Station. Free shuttle bus service operates from JR Osaka...
0
0
1
@ProfMatsuoka
Satoshi Matsuoka
5 hours
The dates are Jan 26-29, 2026. We have kept registration rates low: 50K JPY (~330 USD) for full ACM member registration, and 10K JPY for early-bird student or exhibits-and-keynotes-only registration (6 keynotes over 3 days).
1
0
1
@thoefler
Torsten Hoefler 🇨🇭
23 hours
Just arrived at the ADIA Lab symposium in Abu Dhabi to listen to Horst Simon's introduction and Bjorn Stevens' keynote on how to compute the future climate! Featuring our Gordon Bell finalists 🌍🚀 Looking forward to speculating about how to create an #AI climate scientist 😀.
0
2
12
@TheTuringPost
TuringPost
3 days
A must-read survey: LLM-empowered knowledge graph construction. It connects traditional KG methods with modern LLM-driven techniques, covering:
- KG foundations: ontology, extraction, fusion
- LLM-enhanced ontology: top-down & bottom-up
- LLM-driven extraction: schema-based &
27
113
498
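As a concrete illustration of the survey's "schema-based" extraction setting, here is a minimal sketch of LLM-driven triple extraction constrained to a fixed ontology. The schema, prompt wording, and the `call_llm` stub are illustrative assumptions, not the survey's actual method.

```python
# Sketch: schema-based KG triple extraction with an LLM.
# The schema and call_llm stub are illustrative assumptions.
import json

SCHEMA = {
    "entity_types": ["Person", "Organization"],
    "relation_types": ["works_for", "founded"],
}

def build_prompt(text: str) -> str:
    # Constrain the LLM to the target ontology so extracted triples
    # land in a fixed schema (the "schema-based" setting).
    return (
        "Extract (head, relation, tail) triples from the text.\n"
        f"Allowed entity types: {SCHEMA['entity_types']}\n"
        f"Allowed relations: {SCHEMA['relation_types']}\n"
        'Reply as JSON: [{"head": ..., "relation": ..., "tail": ...}]\n\n'
        f"Text: {text}"
    )

def call_llm(prompt: str) -> str:
    # Stand-in for any chat-completion API; returns a canned answer here.
    return '[{"head": "Ada", "relation": "works_for", "tail": "Acme"}]'

def extract_triples(text: str) -> list[dict]:
    triples = json.loads(call_llm(build_prompt(text)))
    # A downstream fusion step would deduplicate/canonicalize entities.
    return [t for t in triples if t["relation"] in SCHEMA["relation_types"]]

print(extract_triples("Ada works for Acme."))
```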
@unwind_ai_
Unwind AI
4 days
ByteDance just dropped an OCR model that reads documents just like humans. This 0.3B-parameter model analyzes the page layout first, then parses the elements in parallel. 100% open-source.
22
186
2K
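A minimal sketch of the "layout first, then parse elements in parallel" pipeline the tweet describes. `detect_layout` and `parse_element` are hypothetical stand-ins for the model's two stages, not ByteDance's actual API.

```python
# Sketch: two-stage OCR — detect layout once, then parse each
# detected element concurrently instead of decoding the whole page
# token by token. All functions here are illustrative stubs.
from concurrent.futures import ThreadPoolExecutor

def detect_layout(page_image):
    # Stage 1: one pass over the page yields element boxes with types.
    return [
        {"type": "title", "box": (0, 0, 800, 60)},
        {"type": "paragraph", "box": (0, 80, 800, 400)},
        {"type": "table", "box": (0, 420, 800, 700)},
    ]

def parse_element(page_image, element):
    # Stage 2: each element is recognized independently, so the calls
    # can run in parallel.
    return {"type": element["type"], "text": f"<parsed {element['type']}>"}

def ocr_page(page_image):
    elements = detect_layout(page_image)
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda e: parse_element(page_image, e), elements)
    return list(results)

print(ocr_page(page_image=None))
```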
@BrianRoemmele
Brian Roemmele
7 days
IT FREAKING WORKED! At 4am today I just proved DeepSeek-OCR AI can scan an entire microfiche sheet, not just individual cells, and retain 100% of the data in seconds… AND have a full understanding of the text/complex drawings and their context. I just changed offline data curation!
@BrianRoemmele
Brian Roemmele
8 days
BOOOOOOOM! CHINA DEEPSEEK DOES IT AGAIN! An entire encyclopedia compressed into a single, high-resolution image! A mind-blowing breakthrough. DeepSeek unleashed DeepSeek-OCR, an electrifying 3-billion-parameter vision-language model that obliterates the boundaries between text and
430
879
8K
@_avichawla
Avi Chawla
5 days
Fine-tuning LLM Agents without Fine-tuning LLMs! Imagine improving your AI agent's performance from experience without ever touching the model weights. It's just like how humans remember past episodes and learn from them. That's precisely what Memento does. The core concept:
38
207
1K
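A minimal sketch of the episodic-memory idea behind Memento, assuming past episodes are retrieved into the prompt with no weight updates. The similarity measure and storage format are illustrative, not the paper's design.

```python
# Sketch: agent "learns" by retrieving similar past episodes into its
# context; the LLM's weights never change.
from difflib import SequenceMatcher

class EpisodicMemory:
    def __init__(self):
        self.episodes = []  # (task, trajectory, outcome) tuples

    def add(self, task, trajectory, outcome):
        self.episodes.append((task, trajectory, outcome))

    def retrieve(self, task, k=2):
        # Rank stored episodes by crude string similarity to the new task.
        scored = sorted(
            self.episodes,
            key=lambda e: SequenceMatcher(None, e[0], task).ratio(),
            reverse=True,
        )
        return scored[:k]

memory = EpisodicMemory()
memory.add("sort a list", "used sorted()", "success")
memory.add("parse a date", "used datetime.strptime", "success")

# At inference, relevant episodes are prepended to the prompt.
task = "sort a list of tuples"
context = "\n".join(f"Past: {t} -> {tr} ({o})" for t, tr, o in memory.retrieve(task))
print(f"{context}\nNew task: {task}")
```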
@JamesEagle17
James Eagle
3 months
I had to triple-check this. It's common knowledge that DeepSeek and Chinese AI are catching up with the US. But 5 years of supercomputer data show that the opposite is actually happening. In fact, the US is smashing it out of the park.
25
83
280
@theaustinlyons
Austin Lyons
6 days
$CRDO example setup for $NVDA Rubin NVL144. Lots of opportunity with these high-density racks. 🧵
5
39
209
@rryssf_
Robert Youssef
6 days
🚨 Holy shit... Meta just rewrote how Transformers think. They built something called the Free Transformer, and it breaks the core rule every GPT model has lived by since 2017. For 8 years, Transformers have been blindfolded, forced to guess the next token one at a time, with no inner
92
333
2K
@omarsar0
elvis
6 days
Scaling RL for a Trillion-Scale Thinking Model. Scaling RL is hard! But this team might have figured out something. They introduce Ring-1T, a 1T-parameter MoE reasoning model with ~50B params active per token. It's trained with a long-CoT SFT phase, a verifiable-rewards reasoning
12
67
400
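A toy sketch of why a 1T-parameter MoE can activate only ~50B params per token: a router selects the top-k experts, so most weights sit idle for any given token. Sizes here are tiny illustrative numbers, not Ring-1T's architecture.

```python
# Sketch: top-k expert routing. Only k of n_experts run per token,
# so active parameters are a small fraction of total parameters.
import torch

n_experts, d, k = 8, 16, 2
experts = [torch.nn.Linear(d, d) for _ in range(n_experts)]
router = torch.nn.Linear(d, n_experts)

def moe_forward(x):  # x: (d,) for one token
    scores = router(x)
    topk = torch.topk(scores, k)
    weights = torch.softmax(topk.values, dim=-1)
    # Only the k selected experts execute for this token.
    return sum(w * experts[int(i)](x) for w, i in zip(weights, topk.indices))

x = torch.randn(d)
print(moe_forward(x).shape)                   # torch.Size([16])
print(f"active experts per token: {k}/{n_experts}")
```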
@SemiAnalysis_
SemiAnalysis
6 days
Meta has open-sourced their CTran library, which natively works with AMD & NVIDIA GPUs 🚀. Previously, if you wanted multiple NVIDIA GPUs to work together on a workload, you had to use the NVIDIA NCCL library. Although NCCL's source code is public, it does not have an open governance
3
43
339
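For context, the kind of collective NCCL (and now CTran) provides looks like this through torch.distributed's NCCL backend. A minimal all-reduce sketch launched with torchrun, not CTran's own API.

```python
# Sketch: all-reduce across GPUs via the NCCL backend.
# Launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # NCCL handles GPU-to-GPU transport
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank contributes its own tensor; after all_reduce every
    # rank holds the elementwise sum.
    t = torch.full((4,), float(dist.get_rank()), device="cuda")
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    print(f"rank {dist.get_rank()}: {t.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```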
@realJessyLin
Jessy Lin
7 days
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full
50
285
2K
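A minimal sketch of the training setup the tweet describes: freeze the backbone and leave only memory-layer parameters trainable. The "memory" name filter and module layout are assumptions for illustration, not the paper's code.

```python
# Sketch: sparse finetuning of memory layers — the backbone is frozen,
# and only memory slots receive gradient updates. Module names are
# illustrative assumptions.
import torch

model = torch.nn.ModuleDict({
    "backbone": torch.nn.Linear(16, 16),
    "memory": torch.nn.Embedding(1024, 16),  # stand-in memory layer
})

for name, p in model.named_parameters():
    p.requires_grad = "memory" in name  # only memory slots stay trainable

print("trainable:", [n for n, p in model.named_parameters() if p.requires_grad])

# With sparse updates, only the memory rows a batch actually touches
# receive gradients, limiting interference with existing knowledge.
opt = torch.optim.SGD((p for p in model.parameters() if p.requires_grad), lr=0.1)
```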
@askalphaxiv
alphaXiv
6 days
We used DeepSeek OCR to extract every dataset from tables/charts across 500k+ AI arXiv papers for $1000 🚀 See which benchmarks are trending and discover datasets you didn't know existed Doing the same task with Mistral OCR would've cost $7500 👀
49
331
3K
@rohanpaul_ai
Rohan Paul
7 days
Knowledge Flow shows LLMs can push past context limits by carrying a tiny editable knowledge list across attempts. It hits 100% on AIME25 using text only, so test-time memory can unlock big gains. This approach achieved 100% accuracy on AIME 2025 using only open-source models,
@yufan_zhuang
Yufan Zhuang
7 days
Can LLMs reason beyond context limits? 🤔 Introducing Knowledge Flow, a training-free method that helped gpt-oss-120b & Qwen3-235B achieve 100% on AIME-25, no tools. How? Like human deliberation, for LLMs. 📝 Blog: https://t.co/Q6kih5rNXs 💻 Code: https://t.co/CGXMRXVM58
7
31
219
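A minimal sketch of the training-free loop Knowledge Flow describes: carry a small, editable knowledge list across solve attempts instead of a growing transcript. `ask_llm` is a stub; the real prompts and update rules are in the linked blog and code.

```python
# Sketch: retry loop that carries distilled lessons between attempts,
# keeping the carried state tiny so it never exhausts the context.
def ask_llm(prompt: str) -> dict:
    # Stand-in for a model call; would return an attempt plus the
    # lessons the model wants to carry forward.
    return {"answer": "42", "new_knowledge": ["try modular arithmetic"],
            "confident": True}

def solve(problem: str, max_attempts: int = 4) -> str:
    knowledge: list[str] = []  # small editable list, not a full transcript
    for _ in range(max_attempts):
        prompt = (f"Known so far: {knowledge}\n"
                  f"Problem: {problem}\nSolve, then list lessons learned.")
        result = ask_llm(prompt)
        knowledge = (knowledge + result["new_knowledge"])[-10:]  # keep it small
        if result["confident"]:
            return result["answer"]
    return result["answer"]

print(solve("AIME-style problem"))
```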
@SemiAnalysis_
SemiAnalysis
8 days
Next-generation backplanes have the potential to use 400G BiDi SerDes for scale-up connectivity, compared to existing backplanes that use UniDi SerDes. To explain in terms of Minecraft redstone: on the GB200 NVL72 backplane, each direction requires a dedicated line of redstone dust
4
13
144
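A back-of-the-envelope sketch of the lane-count arithmetic implied by the redstone analogy: unidirectional SerDes needs a separate lane per direction, while BiDi carries both directions on one lane. The link count is illustrative, not GB200 specs.

```python
# Sketch: UniDi vs BiDi lane counts for the same number of links.
links = 72                 # hypothetical GPU-to-switch links in a rack
unidi_lanes = links * 2    # one TX lane + one RX lane per link
bidi_lanes = links * 1     # one bidirectional lane per link
print(f"UniDi lanes: {unidi_lanes}, BiDi lanes: {bidi_lanes}")
```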
@Yesterday_work_
Millie Marconi
8 days
🚨 This MIT paper just broke everything we thought we knew about AI reasoning. These researchers built something called Tensor Logic that turns logical reasoning into pure mathematics. Not symbolic manipulation. Not heuristic search. Just tensor algebra. Here's how it works:
114
293
2K
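A toy illustration of the "logic as tensor algebra" flavor: encode a relation as a matrix and apply a Horn rule as a matrix product. The paper's Tensor Logic formalism is far more general than this sketch.

```python
# Sketch: the rule grandparent(x, z) :- parent(x, y), parent(y, z)
# becomes a matrix product over the shared index y — pure tensor
# algebra, no symbolic search.
import numpy as np

people = ["alice", "bob", "carol"]
parent = np.zeros((3, 3), dtype=int)
parent[0, 1] = 1  # alice is a parent of bob
parent[1, 2] = 1  # bob is a parent of carol

grandparent = (parent @ parent) > 0  # count mediating y's, then threshold

for x in range(3):
    for z in range(3):
        if grandparent[x, z]:
            print(f"grandparent({people[x]}, {people[z]})")  # alice, carol
```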
@karpathy
Andrej Karpathy
7 days
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes, data collection etc., but anyway it doesn't matter. The more interesting part for me (esp. as a computer vision person at heart who is temporarily masquerading as a natural language
@vllm_project
vLLM
8 days
🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping
558
2K
13K
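A minimal sketch of offline multimodal inference through vLLM in the style the tweet describes. The model id, prompt template, and image handling follow vLLM's generic multimodal pattern and are assumptions here; check the DeepSeek-OCR model card for exact usage.

```python
# Sketch: running an OCR-style VLM offline on vLLM. The prompt
# template and model id below are assumptions, not confirmed usage.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)

image = Image.open("page.png")
outputs = llm.generate(
    {
        "prompt": "<image>\nTranscribe this document.",  # template is an assumption
        "multi_modal_data": {"image": image},
    },
    SamplingParams(max_tokens=1024, temperature=0.0),
)
print(outputs[0].outputs[0].text)
```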
@casper_hansen_
Casper Hansen
8 days
NEW DeepSeek OCR model that outperforms dots OCR while prefilling 3× fewer tokens
10
44
460