Benjamin Van Durme
@ben_vandurme
1K Followers · 221 Following · 5 Media · 176 Statuses
Glad to see mmBERT put to good use. These look like very useful tokens!
We have just released 📄FinePDFs-Edu, a version of FinePDFs filtered with the FineWeb-Edu approach using ModernBERT and mmBERT. 350B+ tokens of top-tier mid-training data in multiple languages. You can also download the classifiers (all 69 of them!)
Has the week been unFAIR? I’m sorry to hear it! Did you know Microsoft is hiring in genAI across divisions? Check out their portal for a rich set of listings. Here are some examples: https://t.co/RGeLnr8lS9
https://t.co/FgTQUdmcBp
COLM 2025 was one of the best conferences I can recall ever attending. I heard this also from many in coffee chats and dinners. Thank you to all who put it together. I look forward to year three.
🚀 Thrilled to share our new work: "Always Tell Me The Odds" at COLM '25. LLMs struggle with accurate probability predictions, often giving coarse answers. We train decoder-based models to provide fine-grained, calibrated probabilities, significantly outperforming strong baselines!
Summer '26 PhD research internships at Microsoft Copilot Tuning. Continual learning, complex reasoning and retrieval, nl2code, data efficient post-training. https://t.co/HM4cKqEhgW
Large-scale hiring of applied research scientists within Microsoft E&D.
XLM-R has been SOTA for 6 years for multilingual encoders. That's an eternity in AI 🤯 Time for an upgrade. Introducing mmBERT: 2-4x faster than previous models ⚡ while even beating o3 and Gemini 2.5 Pro 🔥 + open models & training data - try it now! How did we do it? 🧵
A cool project that I'm happy to see appearing at TACL.
LLMs power research, decision-making, and exploration, but most benchmarks don't test how well they stitch together evidence across dozens (or hundreds) of sources. Meet MoNaCo, our new eval for cross-source question-answering. 👇
#HopkinsDSAI welcomes 22 new faculty members, who join more than 150 DSAI faculty members across @JohnsHopkins in advancing the study of data science, machine learning, and #AI and translation to a range of critical and emerging fields. https://t.co/tAauSzRFWD
We are hiring Senior and Principal Researchers within Microsoft Copilot Tuning. https://t.co/ZjioyOMRrt
Introducing Jailbreak Distillation 🧨 (EMNLP '25 Findings). We propose a generate-then-select pipeline to "distill" effective jailbreak attacks into safety benchmarks, ensuring eval results are reproducible and robust to benchmark saturation & contamination 🧵
From now on in my advising meetings, any negative result will be met with my response of "think deeper"
We significantly increased the rate limits for the reasoning model by popular demand. If correctness is really important to you, ask the model to "think deeper" or select "gpt5 thinking" in the model picker; this uses a higher reasoning effort than when you are auto-switched.
I am growing an R&D team around Copilot Tuning, a newly announced effort that supports adaptation at a customer-specific level. Join us! https://t.co/kVocnuTrKN We collaborate with a crack team of eng and scientists that support the product, also growing! https://t.co/typyUXfQ8g
Ettin, a two-headed giant ... language model https://t.co/0XK2Q8UECN
Special thanks to @jhuclsp for amazing collaborators Kathryn Ricci @ruyimarone @ben_vandurme @lawrie_dawn, and LightOn with @antoine_chaffin! And this project wouldn't exist without the efforts of ModernBERT (@benjamin_warner @bclavie @jeremyphoward, many more) so 🙏 them also
Will continues to drive great work in the modular use of adapters: from the security benefits of AdapterSwap https://t.co/Cy7CFTAT5W, to RE-adapting https://t.co/ocZPoBpRrj https://t.co/3hZQZVjVar, to the COLM '25 SpectR https://t.co/SebAdAAjhz that enables this new result, LAG.
Check out the paper w/@ben_vandurme now on arXiv: https://t.co/UON3J2P3hN
A new opening for multimodal model research: https://t.co/q0sfaZuIa3 . Please apply if interested.
🚨Wouldn’t it be nice if your agentic search system could reason over all your docs? ✨Introducing Rank-K, a listwise reranker that benefits from test-time compute and long-context! Rank-K sets a new SoTA for reasoning-based reranking, without reasoning chains from other models.
2. Copilot Tuning: Copilot can now learn your company’s unique tone and language. It is all about taking that expertise you have as a firm and further amplifying it so everyone has access.
🚨 Our latest paper is now on ArXiv! 👻 (w/ @ben_vandurme) SpectR: Dynamically Composing LM Experts with Spectral Routing (1/4) 🧵
Wish you could get a Wikipedia style article for unfolding events? Introducing WikiVideo: a new multimodal task and benchmark for Wikipedia-style article generation from multiple videos!