Torsten Scholak

@tscholak

Followers: 2K
Following: 51K
Media: 175
Statuses: 4K

Lead Research Scientist, Foundation Models Lab @ServiceNowRSRCH. Opinions are not those of my employer.

Montréal
Joined February 2010
@tscholak
Torsten Scholak
2 months
🚨🤯 Today Jensen Huang announced SLAM Lab's newest model on the @HelloKnowledge stage: Apriel‑Nemotron‑15B‑Thinker 🚨
A lean, mean reasoning machine punching way above its weight class 👊
Built by SLAM × NVIDIA. Smaller models, bigger impact. 🧵👇
2
21
47
@tscholak
Torsten Scholak
10 days
Nice release! Worth noting the MoE x Mamba gives coverage, not multiplicative speed-ups:
* small batch: expert sparsity keeps latency low
* medium-large batch: Mamba's KV-free scan scales while attention would choke
Net: below dense latency across the board, but no compounding.
@tri_dao
Tri Dao
10 days
Crazy that we now have an open source model with 13B params that’s competitive w o1. And Mamba layers help bring much higher inference throughput.
0
0
4
@tscholak
Torsten Scholak
1 month
RT @joanrod_ai: Thanks @_akhaliq for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learni…
0
41
0
@tscholak
Torsten Scholak
2 months
RT @PShravannayak: 🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datas…
0
15
0
@tscholak
Torsten Scholak
2 months
RT @NVIDIAAI: 🚀 Announced at #Knowledge25: @ServiceNow & @nvidia introduce Apriel Nemotron 15B. Apriel Nemotron 15B is a compact, cost-eff…
0
49
0
@tscholak
Torsten Scholak
2 months
RT @ServiceNowNews: Together with @NVIDIA, we're launching a new class of intelligent AI agents. Our Apriel Nemotron 15B model, co-develope…
0
19
0
@tscholak
Torsten Scholak
2 months
Try it, tune it, test it out!
Huge thanks to the entire SLAM Lab, @carnaticfiddle, @nvidia, and everyone who contributed. 🙌
#Apriel #Nemotron #FastLLM #ServiceNow #NVIDIA #LLM #AI
0
0
5
@tscholak
Torsten Scholak
2 months
🏗️ Built in a 3-stage pipeline:
1️⃣ CPT: 100B+ tokens (math, science, logic, coding)
2️⃣ SFT: 200K curated instructions
3️⃣ RL (GRPO): sharp instruction-following & coding
(+ periodic snapshot merges to prevent forgetting)
Made possible by Fast-LLM,
1
0
4
@tscholak
Torsten Scholak
2 months
📊 Math & Reasoning Benchmarks:
💥 MATH-500: 91.6 (matches O1-mini)
💥 AIME-24: 73.3 (beats Llama-3.1-Nemotron-Nano-8B-v1, LG-ExaOne-32B)
💥 AMC23: 95.0 (top-tier)
Big reasoning, compact model.
1
0
4
@tscholak
Torsten Scholak
2 months
📈 Enterprise & Agentic Tasks:
💥 MBPP: 85.8 (competitive w/ QWQ-32B)
💥 Enterprise RAG: 69.2 (beats O1-mini, LG-ExaOne-32B)
💥 IF Eval: 84.6 (+5 pts vs O1-mini)
Built for real-world impact.
1
0
4
@tscholak
Torsten Scholak
2 months
💾 2× memory efficiency vs larger open models (QWQ-32B, ExaOne-32B)
💵 40% fewer tokens consumed per reasoning task
🎓 Competitive academic reasoning (AIME, AMC, MATH-500, GPQA)
🔓 MIT license
Now live on Hugging Face:
👉
1
0
6
@tscholak
Torsten Scholak
2 months
RT @DBahdanau: I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until yo…
0
113
0
@tscholak
Torsten Scholak
3 months
RT @DBahdanau: AI folks in ServiceNow have been cooking. And they cooked a very delicious small 5B parameter cookie!
0
2
0
@tscholak
Torsten Scholak
3 months
RT @Dorialexander: There isn’t that many newcomers in the SLM space and this one looks very interesting. MIT base models, new open source p…
0
5
0
@tscholak
Torsten Scholak
3 months
RT @RajeswarSai: Showing off Apriel-5B 🚀, an efficient and effective compact model yet. Congrats to the whole SLAM team led by @tscholak @…
0
1
0
@tscholak
Torsten Scholak
3 months
RT @TheJishnuNair: Some new work from the team!
0
1
0
@tscholak
Torsten Scholak
3 months
RT @ostap__alex: Exciting release from ServiceNow Research — introducing Apriel-5B, a compact and efficient open-source language model that…
0
2
0
@tscholak
Torsten Scholak
3 months
RT @sebpaquet: This new, speedy and efficient language model arose from a fruitful collaboration between two teams at ServiceNow! Pretrain…
0
1
0
@tscholak
Torsten Scholak
3 months
Huge thanks and congrats to the SLAM team and @ServiceNowRSRCH 🙌❤️
And a special shoutout to @carnaticfiddle, best co-lead anyone could ask for.
0
0
8
@tscholak
Torsten Scholak
3 months
🧠 Researchers: run it.
🧰 Engineers: fine-tune it.
🧪 Builders: break it.
Tell us what you find. Apriel-5B models are permissively licensed (MIT) and ready to chat. #Apriel #LLM #AI #OpenWeights #FastLLM #SLAM #ServiceNow #ServiceNowResearch
1
0
8