Benjamin Van Durme
@ben_vandurme
1K Followers · 221 Following · 5 Media · 176 Statuses
Glad to see mmBERT put to good use. These look like very useful tokens!
We have just released 📄FinePDFs-Edu, a version of FinePDFs filtered with the FineWeb-Edu approach using ModernBERT and mmBERT. 350B+ tokens of top-tier mid-training data in multiple languages. You can also download the classifiers (all 69 of them!)
Has the week been unFAIR? I’m sorry to hear it! Did you know Microsoft is hiring in genAI across divisions? Check out their portal for a rich set of listings. Here are some examples: https://t.co/RGeLnr8lS9
https://t.co/FgTQUdmcBp
COLM 2025 was one of the best conferences I can recall ever attending. I heard this also from many in coffee chats and dinners. Thank you to all who put it together. I look forward to year three.
🚀 Thrilled to share our new work: "Always Tell Me The Odds" at COLM '25. LLMs struggle with accurate probability predictions, often giving coarse answers. We train decoder-based models to provide fine-grained, calibrated probabilities, significantly outperforming strong baselines!
Summer '26 PhD research internships at Microsoft Copilot Tuning. Continual learning, complex reasoning and retrieval, nl2code, data efficient post-training. https://t.co/HM4cKqEhgW
Large-scale hiring of applied research scientists within Microsoft E&D.
XLM-R has been SOTA for 6 years for multilingual encoders. That's an eternity in AI 🤯 Time for an upgrade. Introducing mmBERT: 2-4x faster than previous models ⚡ while even beating o3 and Gemini 2.5 Pro 🔥 + open models & training data - try it now! How did we do it? 🧵
A cool project that I'm happy to see appearing at TACL.
LLMs power research, decision-making, and exploration, but most benchmarks don't test how well they stitch together evidence across dozens (or hundreds) of sources. Meet MoNaCo, our new eval for cross-source question-answering. 👇
#HopkinsDSAI welcomes 22 new faculty members, who join more than 150 DSAI faculty members across @JohnsHopkins in advancing the study of data science, machine learning, and #AI and translation to a range of critical and emerging fields. https://t.co/tAauSzRFWD
We are hiring Senior and Principal Researchers within Microsoft Copilot Tuning. https://t.co/ZjioyOMRrt
Introducing Jailbreak Distillation 🧨 (EMNLP '25 Findings). We propose a generate-then-select pipeline to "distill" effective jailbreak attacks into safety benchmarks, ensuring eval results are reproducible and robust to benchmark saturation & contamination 🧵
From now on in my advising meetings, any negative result will be met with my response of "think deeper"
We significantly increased the rate limits for the reasoning model by popular demand. If correctness is really important to you, ask the model to "think deeper" or select "gpt5 thinking" in the model picker; this uses a higher reasoning effort than when you are auto-switched.
I am growing an R&D team around Copilot Tuning, a newly announced effort that supports adaptation at a customer-specific level. Join us! https://t.co/kVocnuTrKN We collaborate with a crack team of eng and scientists that support the product, also growing! https://t.co/typyUXfQ8g
Ettin, a two-headed giant ... language model https://t.co/0XK2Q8UECN
Special thanks to @jhuclsp for amazing collaborators Kathryn Ricci @ruyimarone @ben_vandurme @lawrie_dawn, and LightOn with @antoine_chaffin! And this project wouldn't exist without the efforts of ModernBERT (@benjamin_warner @bclavie @jeremyphoward, many more) so 🙏 them also
Will continues to drive great work in the modular use of adapters: from the security benefits of AdapterSwap https://t.co/Cy7CFTAT5W, to RE-adapting https://t.co/ocZPoBpRrj https://t.co/3hZQZVjVar, to the COLM '25 SpectR https://t.co/SebAdAAjhz that enables this new result, LAG.
Check out the paper w/@ben_vandurme now on arXiv: https://t.co/UON3J2P3hN
A new opening for multimodal model research: https://t.co/q0sfaZuIa3 . Please apply if interested.
🚨Wouldn’t it be nice if your agentic search system could reason over all your docs? ✨Introducing Rank-K, a listwise reranker that benefits from test-time compute and long-context! Rank-K sets a new SoTA for reasoning-based reranking, without reasoning chains from other models.
2. Copilot Tuning: Copilot can now learn your company’s unique tone and language. It is all about taking that expertise you have as a firm and further amplifying it so everyone has access.
🚨 Our latest paper is now on ArXiv! 👻 (w/ @ben_vandurme) SpectR: Dynamically Composing LM Experts with Spectral Routing (1/4) 🧵
Wish you could get a Wikipedia style article for unfolding events? Introducing WikiVideo: a new multimodal task and benchmark for Wikipedia-style article generation from multiple videos!